Abstract


Purpose — This paper offers a practical insight into the application of Lotka’s law of author 
productivity to the question of how likely it is that an author will return to a particular publisher 
(rather than make another contribution to a subject literature, which is its usual application). 
The question of author loyalty, especially repeat visits, is one which is of great interest to 
publishers. 

Design/methodology/approach — This paper shows, possibly for the first time, that the author 
productivity distribution predicted by Lotka’s law for subject literatures also holds for publisher 
aggregates, in this case, all Emerald authors. 

Findings — The ideas presented here are speculative and programmatic: they raise questions and 
provide a robust intellectual framework for further research into the determinants of author loyalty, as 
seen from the publisher side. 

Practical implications — The implications for commissioning editors and marketing departments 
in journal publishing houses are that repeat visiting authors are indeed scarce commodities, not 
necessarily because of barriers put in their way by publishers, but because research production is very 
asymmetrically skewed in favour of a small productive élite. 


Originality/value — By analysing survey data it should be possible, within very broad parameters, 
to identify clusters of say high, medium and low research activity authors. This would provide insight
into potential “hot spots” of future publishing intent and, in the case of dense and overworked research
areas, early warning as to when to start looking elsewhere for future articles.


Keywords Research results, Brand loyalty, Publishing 
Paper type General review 


Background 

In May last year, John Peters, an Emerald Director, broadcast an e-mail to members of 
the Literati Club containing facts and figures drawn from their author database, as of 
April 2003. The data comprised the numbers of times that individual authors had
published with Emerald and the purpose of the message was to stimulate discussion on
an issue of strategic commercial significance for any publisher: the question of author
loyalty.
Of these:

13,428 (65 per cent) have written once only for us
7,216 (35 per cent) have written more than once and are repeat authors

Of these 7,216 repeat authors:

3,327 have written twice (16 per cent)
1,437 have written three times (7 per cent)
782 have written four times (4 per cent)
487 have written five times 
297 have written six times 
218 have written seven times 
149 have written eight times 
99 have written nine times 
95 have written ten times 
79 have written 11 times 
45 have written 12 times 
37 have written 13 times 
31 have written 14 times 
30 have written 15 times 

110 have written 16 times or more. This includes some freelances. 

418 (2 per cent) have written ten times or more, and 1,668 (8 per cent) five times or more. 
These are our most “loyal” authors. 

A handful of authors have been published 100 times or more by us. These are largely 
people who have co-authored papers with, e.g. research students. 

In almost every sphere of business, existing (satisfied) customers are an organisation’s 
most likely future customers. It is easier to sell to a (satisfied) current customer than to a 
non-customer. 

I hope you find these figures and the stories behind them interesting. If any thoughts or 
actions occur to you, please feel free to share them with me. 

John Peters, Director, Emerald/MCB. 


In his message, John makes the obvious connection that authors who publish 
frequently with a particular journal or publisher might be said to display a degree of 
brand loyalty. Repeat authors are of particular interest to publishers because as 
existing and demonstrably satisfied customers, they may reasonably be expected to 
submit manuscripts in the future, thus securing an editor’s access to a steady flow of 
research findings. From our knowledge of consumer behaviour in other areas of life, 
we can predict that loyal authors are less likely to switch brands. Even better, they 
deliver indirect benefits to the journal publisher, such as an advocacy role within the 
academic world, encouraging their research students and peers to consider 
publishing with Emerald and their library committees to subscribe to their 
offerings. 

Although not stated explicitly, the question being presented by John Peters was “How 
can Emerald manage its author relationships to encourage more repeat business?” In 
other words, how can it shift the frequency distribution above to its advantage? 

As a bibliometrician, my immediate impression of the data was that the author
distribution appeared to comply with Lotka’s “law”. This short communication
explains that law, its applicability to the data and some immediate practical
consequences for Emerald and other publishers, and reflects on some issues for further
investigation.








What is Lotka’s law? 

Alfred Lotka investigated author productivity and modelled it mathematically during 
the last century (Lotka, 1926). Essentially, his law describes the very regular patterns 
that are seen in subject bibliographies when authors are listed by the number of times 
they have published. 


A definition of Lotka’s law 


Lotka’s law describes the frequency of publication by authors in a given field. It states that
“the number (of authors) making n contributions is about $1/n^2$ of those making one; and the
proportion of all contributors that make a single contribution is in the region of 60 per cent”
(Lotka, 1926, cited in Potter (1988)). This means that out of all the authors in a given field, 60
per cent will have just one publication; 15 per cent will have two publications ($1/2^2$ times
0.60); 7 per cent will have three publications ($1/3^2$ times 0.60), and so on.

More generally, the law takes the form $y_x = c/x^n$, where $y_x$ is the number of authors
credited with $x$ ($x = 1, 2, 3, \ldots$) papers, $c$ is the number of authors contributing one
paper, and $n$ is a rate (usually $n = 2$).
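
As a quick check of the arithmetic in this definition, the short Python sketch below evaluates the general form for the first few values of x. It assumes that c is expressed as a proportion of all authors (0.60) rather than as a head count, and the function name `lotka_share` is purely illustrative.

```python
def lotka_share(x, c=0.60, n=2.0):
    """Share of authors credited with exactly x papers, y_x = c / x**n.

    Assumption: c is given as a proportion of all authors (0.60 in the
    classic case), not as a head count.
    """
    if x < 1:
        raise ValueError("x must be a positive integer")
    return c / x ** n


if __name__ == "__main__":
    for x in range(1, 6):
        print(f"{x} paper(s): {100 * lotka_share(x):.1f} per cent of authors")
```

With the default values this gives 60 per cent for one paper, 15 per cent for two and roughly 7 per cent for three, matching the figures in the definition.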


The law has been found to be robust and pretty well universal in its applicability,
extending beyond the world of scholarly publishing to even describe the productivity 
of software developers in open source systems (Newby et al., 2003). With the isolated 
exception of a study on Dutch high-energy physics (Kretschmer and Rousseau, 2001), 
where typically more than 100 authors are recorded on each paper, Lotka’s law appears 
to be a highly resilient and structural feature of intellectual productivity across many 
different fields. Usually presented as a way of describing author contributions to a 
literature, this phenomenon may be of interest to journal publishers if we overlay 
commercial concepts of brand loyalty and repeat business onto the more neutral 
language of bibliometrics. 

The data supplied by Emerald in fact fits Lotka’s law very closely (Table I), despite 
its comprising a number of joint subject bibliographies, notably in business and the 
information sciences. (In this analysis, no fractional weighting was applied for 
co-authors, and so all author contributions were counted as one.) 

There are perhaps slightly more first time and slightly fewer very highly productive
authors than might have been expected in a “classic” literature, but this is likely to be
explained by the relatively small time window to which the data refers (and the jury is
still out as to whether Lotka’s model tends to overestimate very highly productive
authors in any case).

Its hybrid nature makes understanding the dynamics of the authorship data rather 
difficult. We might refer to this as the “art gallery problem”. Emerald’s one-time 
authors may in fact be prolific writers who are visiting for the first time. Furthermore, 
they may be highly productive in fields remote from Emerald’s scope and coverage 
(perhaps a learned statistician publishing for the first and only time in the information 
science literature). They are not at all necessarily the “natural losers” implied by 
Lotka’s model, marooned at the bottom rung of the productivity distribution. 

Within the specific snapshot that we are considering here, there will be many other
differences between authors: academics or practitioners; people in early, mid- or late
publishing careers; people with positive or negative attitudes to Emerald. As with people
slowly progressing through a popular art exhibition, a later snapshot will comprise a
different population mix: some will still be there; others will have left or entered the
gallery. Not good news for exhaustive bibliometric research under controlled
conditions.

Table I. Emerald data and Lotka’s predictions (%)

Author repeats    Lotka’s model   ciber’s best fit*   Emerald actuals
0                          61.0                65.0              65.0
1                          15.3                15.5              16.1
2                           6.8                 6.7               7.0
3                           3.8                 3.7               3.8
4                           2.4                 2.3               2.4
5                           1.7                 1.6               1.4
6                           1.2                 1.2               1.1
7                           1.0                 0.9               0.7
8                           0.8                 0.7               0.5
9                           0.6                 0.6               0.5
10                          0.5                 0.5               0.4
11                          0.4                 0.4               0.2
12                          0.4                 0.3               0.2
13                          0.3                 0.3               0.2
14                          0.3                 0.2               0.1
15+                         3.6                 0.1               0.5
Total                     100.0               100.0             100.0

Notes: *ciber modelled the data using n = 2.065, c = 0.65
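
The comparison behind Table I can be reproduced approximately from the counts quoted in the e-mail. The sketch below is illustrative only: it assumes the author total is the sum of once-only and repeat authors (13,428 + 7,216), uses c = 0.61 with n = 2 for the classic model and the fitted values from the table note (n = 2.065, c = 0.65), and indexes by number of papers rather than by repeats (papers minus one). The names are hypothetical.

```python
# Author counts by number of papers, as quoted in the e-mail (1 to 10 papers).
EMERALD_COUNTS = {1: 13428, 2: 3327, 3: 1437, 4: 782, 5: 487,
                  6: 297, 7: 218, 8: 149, 9: 99, 10: 95}

# Assumption: the author total is once-only authors plus repeat authors.
TOTAL_AUTHORS = 13428 + 7216


def actual_pct(papers):
    """Observed percentage of authors with exactly this many papers."""
    return 100 * EMERALD_COUNTS[papers] / TOTAL_AUTHORS


def predicted_pct(papers, c, n):
    """Percentage predicted by y_x = c / x**n, with c as a proportion."""
    return 100 * c / papers ** n


if __name__ == "__main__":
    print("papers  actual  classic  fitted")
    for papers in EMERALD_COUNTS:
        print(f"{papers:6d}  {actual_pct(papers):6.1f}  "
              f"{predicted_pct(papers, 0.61, 2.0):7.1f}  "
              f"{predicted_pct(papers, 0.65, 2.065):6.1f}")
```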

However, given the apparent fit of the data with one of bibliometrics’ most
established laws, we would expect a very similar pattern to obtain next month, next
year, in five years’ time, and nothing that Emerald can do will shift this in any
significant way (other than by moving into Dutch high-energy physics). We would
expect the proportion of single visitors to gradually decrease towards 61 per cent as the
Emerald brand matures but, essentially, at any one point in time, Emerald can expect a
rough 60/40 split between single and return visitors. The difficulty facing Emerald is
that very few authors are consistently highly productive.
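
One plausible reading of the 61 per cent figure, offered here as an assumption rather than something the Emerald data establish: if n = 2 exactly and the shares $y_x$ are required to sum to one over all x, then c is fixed by the normalisation.

```latex
\sum_{x=1}^{\infty} y_x \;=\; \sum_{x=1}^{\infty} \frac{c}{x^{2}}
\;=\; c\,\frac{\pi^{2}}{6} \;=\; 1
\quad\Longrightarrow\quad
c \;=\; \frac{6}{\pi^{2}} \;\approx\; 0.608
```

On that reading, roughly 61 per cent single-paper authors and 39 per cent repeat authors is the limiting split, which is consistent with the rough 60/40 figure.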

First time authors are important because they provide the majority of Emerald’s 
articles, and within this group, some may go on to become more frequent visitors and 
are therefore worth cultivating. Frequently returning authors are important too, for 
other reasons, but the main point of this paper is that their proportion will always be 
relatively small in terms of overall production. 


Why is author production patterned like this? 

Why are some scholars more productive than others? The first thing to say about 
Lotka’s law is that it may well describe but it does not explain what is going on: the 
underlying human behaviour that gives rise to these patterns is far from being fully 
understood. There is pretty good evidence that frequency of publication correlates 
significantly with frequency of citation and professional reputation (Merton, 1988) and 
that being part of a stimulating, privileged intellectual environment is a necessary 
condition for being productive. This line of argument stresses that, independent of 
talent, authors require the right conditions to become productive: they need the 
confidence that feeds on success, access to research grants, freedom from teaching and 
administration, the esteem of their peers, access to specialist equipment, the
stimulation of teams of fellow researchers, and a supportive and well managed
research culture (David, 1994; Bozeman and Lee, 2003; von Tunzelmann et al., 2003).
These resources are all in scarce supply, and because publishing itself carries certain
rewards (like credibility, standing), there is a virtuous circle whereby these
necessary resources flow disproportionately to those that publish more. But since
competition for resources is so tough, only a few manage to break away from the rest of
the pack. This “success breeds success” phenomenon or “Matthew Effect” was
certainly understood in principle, albeit in a different context, 2,000 years ago, as
illustrated in a famous Biblical passage:


For whosoever hath, to him shall be given, and he shall have more abundance: but whosoever
hath not, from him shall be taken away even that he hath (Matthew xiii:12).


This principle is embedded in UK higher education policy which is to focus money 
where it will have the most visible and immediate impact, through research selectivity 
mechanisms such as the UK Research Assessment Exercises. In this model, money and 
influence chase publications with a vengeance. 

An alternative hypothesis has been proposed by Dogan and Pahre (1990) to explain
the highly asymmetric contribution of authors to a given literature. All things being 
equal, one would expect that the more research effort that is applied to a particular set 
of problems, the greater the number and quality of research outputs. Dogan and Pahre 
argue that this may not be the case in particularly well researched (or “dense”) subject 
areas. As more and more researchers and funds pile into a given area, the amount of 
innovative work increases, but at a decreasing rate. In this “paradox of density” 
scenario, publishing opportunities also increase at a decreasing rate as the new recruits 
to the field find that most of the pioneering work has already been achieved by those 
who originally saw the opportunity and capitalised (i.e. published) on it. 

The potential implications of these theoretical models, both for public policy and for
publishing business development, are enormous, yet sadly neglected by the research
community itself.


Areas for future research 

One implication of Lotka’s law is that past publication increases the probability of 
further publication, for the reasons outlined above; it is very much more likely that 
someone who has already written 49 papers will write a 50th than it is for someone 
who has published four to write their fifth. The problem, if we frame the question as “who
is most likely to publish again with Emerald?”, is that we cannot say simply by looking
at the existing data (the “art gallery” problem, compounded by the fact that there are
other shows in town). However, Lotka’s law provides a robust general framework
within which future propensities to publish may be assessed.
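
The claim about the 50th versus the fifth paper can be illustrated numerically. The sketch below treats Lotka’s law with n = 2 as a probability distribution over an author’s total output, which is an assumption that goes beyond what the Emerald data show, and estimates the chance that an author with at least x papers goes on to at least x + 1. The function names and the truncation length are illustrative choices.

```python
def tail(x, n=2.0, terms=100_000):
    """Approximate the sum of 1/k**n for k >= x by truncating after `terms` terms."""
    return sum(1.0 / k ** n for k in range(x, x + terms))


def prob_another(x, n=2.0):
    """P(at least x + 1 papers | at least x papers) under y_x proportional to 1/x**n."""
    return tail(x + 1, n) / tail(x, n)


if __name__ == "__main__":
    for papers in (1, 4, 49):
        print(f"after {papers} paper(s): "
              f"{100 * prob_another(papers):.0f} per cent chance of another")
```

Under these assumptions the continuation probability rises with the number of papers already written: it is roughly 78 per cent after four papers and roughly 98 per cent after 49.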

By analysing survey data including variables such as chronological and “academic 
age”, numbers and frequency of previous publications, access to research grants, job 
status, etc., it should be possible, within very broad parameters, to identify clusters
of say high, medium and low research activity authors. This would provide Emerald 
with insights into potential “hot spots” of future publishing intent and, in the case of 
dense and overworked research areas, early warning as to when to start looking 
elsewhere for future articles. 
A “fuzzy” approach to multi-channel information optimisation


Andrew Boyd
London, UK


Abstract 

Purpose - To validate the use of fuzzy control systems for information channel optimisation. 
Design/methodology/approach — The research presents findings from a multi-year case-based 
study of an international software organisation. At the outset of the study, baseline log-file data were 
collected from the organisation’s customer relationship management and financial systems. As part of 
a business process reengineering effort, a fuzzy control system model was created and implemented to 
optimise the software support communication channels. After the first year, data were recollected to 
determine the effectiveness of the model. The log-file analysis was augmented with individual 
interviews of stakeholders within the business. 

Findings — The optimisation strategy based on the fuzzy control system allowed the organisation to 
focus on answering more queries from higher value customers and cut the support resolution time 
nearly in half, in less than a year. By focusing on higher value customers and more productive 
information channels, staff efficiency increased and costs were reduced. This research indicates that 
customers using synchronous communication channels such as the telephone seem to get better 
service than those using asynchronous channels such as e-mail or web. Additionally, the research also 
indicates that several geographic factors such as proximity and language proficiency could influence 
information channel choice affecting the level of service received. 

Research limitations/implications — The case findings could be specific to the observed 
organisation or to the software service industry. Additional research is necessary to determine the 
universality of the method and ancillary findings. 

- Originality/value — Methods outlined in this case provide both practitioners and researchers with 
new tools to explore and react to the challenges of information channel proliferation. 

Keywords Fuzzy control, Information transfer, Information management,
Customer service management, Business process re-engineering


Paper type Case study 


Background 
The first part of this case study presented a software company that was undergoing a 
significant organisational transformation (Boyd, 2004b). With redundancies and
several rapid changes in management, business processes were broken and procedural 
documentation was long out of date. Through the adoption of a goal-based information 
retrieval framework, the organisation was able to create and document repeatable, yet 
flexible, business processes that linked high-level business objectives to data sources. 
The organisation communicated with its customers through a number of
information channels including phone, e-mail, web and occasionally other mediums
such as face-to-face and fax. Using the goals, questions, indicators and measures
(GQIM) framework as a guide, the concluding part of the case study illustrates the
application of a fuzzy control system to optimise customer service information
channels. This work follows up a conceptual model presented in this journal earlier in
the year (Boyd, 2004a).

Table I. Number of support tickets opened by information channel (2002)


Methodology 

Research data were collected at the beginning of the project to assess the current 
situation and serve as a benchmark to determine the success (or failure) of the 
turnaround efforts. After the first year, data were recollected to determine the 
effectiveness of the process redesign and management efforts. Quantitative 
information was drawn from the company’s customer relationship management 
(CRM) system, financial databases and personal sources such as Excel spreadsheets. 
During this research, a total of 6,247 support tickets were analysed. Baseline data (from 
calendar year 2002) were collected in February and March of 2003. In 2002, 3,111 
support tickets were created and 2,982 tickets were closed by a team of 4-5 support 
analysts (in May 2002 one analyst transferred to another area of the business). Data for 
2003 were collected in January of 2004. In 2003, 3,136 tickets were created and 3,091 
were closed by 3-4 support analysts (in July one analyst resigned and was not 
replaced). This information was supplemented by individual interviews with the 
support manager and three support team members. All identities have been masked to 
preserve employee, customer and company confidentiality. 


Baseline situation analysis (2002) 

The organisation provided technical support through a number of information
channels including phone, e-mail, web and occasionally other mediums such as
face-to-face and fax. In 2002, 72 per cent of the support tickets were opened over the
phone, with a majority of the remainder (26 per cent) logged via e-mail (Table I).
Interestingly, the tickets opened by phone were closed on average seven days faster 
than the ones opened by e-mail and on average 11 days faster than the ones opened 
through a web-based self-service portal. Even during this cursory analysis of ticket 
source, it was clear that there was a difference in either the type of issue reported 
through each medium or in the team’s ability to respond to the issues. 





                      Status
              Total   Closed   Open   Average days open
E-mail          815      807      8                  17
%                26       99      1
Other             5        5                         14
%                 0
Phone         2,226    2,186     40                  10
%                72       98      2
Web              65       64      1                  21
%                 2       98      2
Grand total   3,111    3,062     49                  12
%                         98      2
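
As a consistency check on the 2002 channel figures, the grand-total ADO of 12 days can be recovered as a ticket-weighted mean of the per-channel ADOs. The sketch below assumes that this is how the total was derived (the source does not say), and the names are illustrative.

```python
# channel: (tickets opened in 2002, average days open), from the 2002 channel table.
CHANNELS_2002 = {
    "E-mail": (815, 17),
    "Other": (5, 14),
    "Phone": (2226, 10),
    "Web": (65, 21),
}


def weighted_ado(channels):
    """Ticket-weighted mean of the per-channel average days open."""
    total_tickets = sum(tickets for tickets, _ in channels.values())
    weighted_days = sum(tickets * ado for tickets, ado in channels.values())
    return weighted_days / total_tickets


if __name__ == "__main__":
    print(f"Overall ADO: {weighted_ado(CHANNELS_2002):.1f} days")
```

Because phone tickets make up 72 per cent of the volume, the overall figure sits much closer to the phone ADO of ten days than to the e-mail or web figures.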








A “fuzzy” measure of strategic value 

At the outset of the project, there was no organisational way to determine the strategic
value of a customer. During the GQIM exercise, which was part of a business process
engineering programme (see Boyd, 2004b), a qualitative assessment of each account
was undertaken by the management team. Based on revenue contribution and the
prestige of the account, each customer was assigned a 0-10 rating (Strategic Value
Index, or SVI). It should be noted that, since this was a qualitative assessment, the
numerical SVI rating can only be considered an ordinal value. Since the management 
team was after an approximation or directional indicator of value, it was decided not to 
define the SVI in interval or absolute terms. Rather a flexible approach based on “fuzzy 
numbers” would be adopted and a fuzzy control system (FCS)[1] would be used to 
logically group customers according to how they were to be treated. This approach to 
developing communications strategies was previously outlined by Boyd (2004a). 

Immediately, it was clear that there was a disparity in the level of support received 
by each constituency. When the average days (that a support ticket remained) open 
(ADO) was plotted, it was revealed that the less strategic customers seemed to be 
getting a slower response time to issues (Figure 1). In Figure 1, the x-axis has several 
triangular lines protruding from it. The lines indicate the completeness of belonging to 
each value grouping. For instance, an SVI of 3 could mean a low-value or medium-value
customer, but a 4 or 5 rating falls squarely in the medium group. Similarly, a 7 could
belong to the medium or high-value group.
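
The idea of partial membership can be sketched in a few lines of Python. The breakpoints below are assumptions chosen only so that the examples in the text come out as described (an SVI of 3 split between low and medium, 4 and 5 fully medium, 7 split between medium and high); the exact shapes drawn in Figure 1 are not recoverable from the text, and all names are hypothetical.

```python
def trapezoid(x, a, b, c, d):
    """Membership that rises from a to b, equals 1 from b to c, and falls from c to d."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Assumed breakpoints (a, b, c, d) on the 0-10 SVI scale; not taken from the source.
SVI_GROUPS = {
    "low": (0, 0, 2, 4),
    "medium": (2, 4, 6, 8),
    "high": (6, 8, 10, 10),
}


def svi_membership(svi):
    """Degree of belonging of an SVI score to each value group."""
    return {name: trapezoid(svi, *points) for name, points in SVI_GROUPS.items()}


if __name__ == "__main__":
    for svi in (3, 4, 5, 7):
        print(svi, svi_membership(svi))
```

A crisp cut-off would force a customer rated 3 into exactly one group; the fuzzy version keeps both readings open until a response strategy has to be chosen.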

Although management was not unhappy about this finding, no specific policy 
dictated that more strategic customers should be treated differently than less strategic 
customers. During a round-table discussion, several hypotheses were put forth as the 
cause of this phenomenon: 


H1. More strategic customers were getting better service, either through internal
(management) pressure or familiarity (the team had a more intimate working
relationship with high-volume users).

H2. There was a difference in the types of issues reported via the different
mediums, i.e. high priority issues got reported by phone and low priority
issues were reported through asynchronous mediums such as web and
e-mail.

H3. Experience of the customer influenced the medium chosen and amount of
support used.

H4. A fourth factor such as geographic location/language influenced the
medium of choice.

Figure 1. More strategic customers received better service

Table II. Number of support tickets and customers by strategic value

Table III. Number of support tickets by place of origin



Next, the SVI rating was cross-tabulated by the sum of tickets (Table II). Surprisingly,
ticket usage was split roughly evenly between (non-fuzzy categorisations of) low
(SVI = 0-2), medium (SVI = 3-6) and high (SVI = 7-10) value participants. That is, the 16
high-value partners opened nearly as many support tickets as the 152 low-value partners.

What was unclear at this point was whether the slower response for less strategic
customers was due to an implicit service policy favouring more valuable customers, or to
some other reason. When a regional analysis was conducted, it was revealed that the vast
majority of tickets originated with local customers (Table III).
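
The near-even split can be checked directly from the customer and ticket counts per SVI score. The sketch below uses the crisp (non-fuzzy) groupings given in the text; the data are copied from the strategic-value table and the names are illustrative.

```python
# SVI score: (customers, tickets opened in 2002), from the strategic-value table.
BY_SVI = {
    0: (1, 24), 1: (132, 804), 2: (19, 245),
    3: (20, 172), 4: (9, 233), 5: (4, 122), 6: (11, 529),
    7: (10, 261), 8: (3, 265), 9: (1, 101), 10: (2, 355),
}

# Crisp groupings quoted in the text: low 0-2, medium 3-6, high 7-10.
GROUPS = {"low": range(0, 3), "medium": range(3, 7), "high": range(7, 11)}


def group_summary(by_svi=BY_SVI):
    """Customers, tickets, ticket share and tickets per customer for each group."""
    total_tickets = sum(tickets for _, tickets in by_svi.values())
    summary = {}
    for name, scores in GROUPS.items():
        customers = sum(by_svi[s][0] for s in scores)
        tickets = sum(by_svi[s][1] for s in scores)
        summary[name] = {
            "customers": customers,
            "tickets": tickets,
            "share_pct": round(100 * tickets / total_tickets),
            "tickets_per_customer": tickets / customers,
        }
    return summary


if __name__ == "__main__":
    for name, row in group_summary().items():
        print(name, row)
```

The shares come out at about a third each, but tickets per customer differ sharply: roughly seven for the low-value group against more than sixty for the high-value group.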

Strategic value  Customers   Tickets     %   Group
0                        1        24     1   Low value (34%, n = 152)
1                      132       804    26
2                       19       245     8
3                       20       172     6   Medium value (34%, n = 44)
4                        9       233     7
5                        4       122     4
6                       11       529    17
7                       10       261     8   High value (32%, n = 16)
8                        3       265     9
9                        1       101     3
10                       2       355    11
Grand total            212     3,111   100

Place of origin    Tickets     %
UK                   2,064    66
France                 254     8
DACH                   220     7
Spain                  141     5
Other - Europe          93     3
The Netherlands         66     2
Denmark                 43     1
Russia                  35     1
Portugal                23     1
Unknown                172     6
Total                3,111   100

However, when the average days open was analysed by region, it was immediately
clear that the non-UK (foreign) customers were getting their queries answered more
slowly than UK-based (local) customers (Figure 2). This could have been a result of:

* language difficulties and/or product problems relating to regional operating
systems;
* use of a different communication medium; or
* as a whole these were less strategic customers.

[Figure 2 chart: “ADO by Geography”, bars by country; axis: Average Days Open]


On the whole, language or regional settings issues seem to be a contributing factor in
slower response time. Holland, where English is known to be spoken very well, had
nearly the same ADO score as the UK. The Italian customer, who was very good
technically and spoke English well, also had a significantly lower ADO score than the
UK customers. France, Russia, Germany (DACH), Spain and Denmark, on the other
hand, all had ADO scores significantly higher than the UK.

Clearly, medium also seems to influence response time. On average, an e-mail ticket 
remains open seven days longer and a web ticket stays open 11 days longer than a 
phone ticket (Table I). However, it should be noted that during round-table discussions, 
the support team said that phone tickets can sometimes be easier to close because they 
can ask for and get all of the necessary information needed to investigate the problem. 

Very interestingly, the country of origin coupled with medium also seems to be an
influencing factor (Table IV). UK customers, who by an overwhelming majority (80 per
cent) used the phone to communicate with the support team, had a significantly lower ADO
than the nearest statistically relevant comparison (France at 66 per cent phone usage).
Those countries that predominantly relied on e-mail and web had significantly higher
ADO averages.
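
One rough way to quantify this pattern is a correlation between each country’s phone share and its ADO, using the figures in Table IV. The sketch below is only indicative: it is an unweighted Pearson correlation over ten rows, some of which rest on very few tickets (Italy has seven), and it is not an analysis performed in the original study. Africa is omitted because its ADO is not available.

```python
from math import sqrt

# country: (phone share of tickets in per cent, average days open), from Table IV.
BY_COUNTRY = {
    "UK": (80, 10), "Italy": (71, 5), "France": (66, 14),
    "The Netherlands": (64, 9), "Other - Europe": (52, 5), "DACH": (51, 15),
    "Portugal": (48, 4), "Denmark": (38, 26), "Spain": (28, 22), "Russia": (26, 32),
}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def phone_ado_correlation(data=BY_COUNTRY):
    """Correlation between phone share and ADO across countries."""
    phone_share, ado = zip(*data.values())
    return pearson(phone_share, ado)


if __name__ == "__main__":
    print(f"r = {phone_ado_correlation():.2f}")
```

The coefficient is clearly negative, in line with the observation that countries relying more on the phone tended to have tickets closed sooner; it says nothing about which way the causation runs.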

Of the 16 high-value customers, only seven come from outside the UK (Table V). 
Therefore, it stands to reason that, in this case, “strategic value” is linked to geographic 
location. 

With the exception of web ticket usage (which has an admittedly small sample size) and
of the very highest ratings, channel choice does not seem to be significantly influenced
by strategic value (Table VI).

Additionally, there seems to be a positive relationship between experience and
average days open (Table VII). For the customers with an SVI score of four or above,
experience was qualitatively assessed using a “low”, “medium” and “high” scale. Not
surprisingly, the more experienced customers found more difficult cases that on the
whole take longer to close.

Figure 2. Average days a ticket remained open by country of origin

Table IV. Number of support tickets opened and average days open by information channel and
country of origin

                   Total*   E-mail   (%)   Phone   (%)   Web   (%)   ADO
UK                  2,060      389    19   1,653    80    18     1    10
Italy                   7        2    29       5    71                 5
France                254       80    31     168    66     6     2    14
The Netherlands        66       24    36      42    64                 9
Other - Europe         85       31    36      44    52    10    12     5
DACH                  220      105    48     113    51     2     1    15
Portugal               23       12    52      11    48                 4
Denmark                42       20    48      16    38     6    14    26
Spain                 141       81    57      39    28    21    15    22
Russia                 35       24    69       9    26     2     6    32
Africa                  1        1   100                             N/A
Grand total         2,934      769    26   2,100    72    65     2

Note: *Unknowns removed


In summary, it seems that in 2002 customers got better service for the following,
interlinked, reasons:

* Strategic value to the organisation. Customers that generated more revenue or
had a higher “prestige value” to the organisation got better service.

* Geographic location and language. Regardless of any correlation (although
unsubstantiated in this research) between geography and strategic value,
customers that were native English speakers received better service than their
foreign counterparts.

* Propensity to use the telephone. Telephone usage (in contrast to e-mail or
web) also seemed to influence the level of service. On average, telephone queries
were answered significantly faster than queries through other mediums. Several
factors could account for better service provided over the telephone. First, support
analysts could gather all of the information at once or could answer the query with one
contact (thus eliminating delayed responses). Second, use of the phone is linked to
geographic location and language: native English speakers (or customers with
good English skills) were more likely to use the telephone.

* Experience of the customer. In aggregate, customers with a higher SVI receive
better service than lower value customers, but higher experience levels seem to be
linked to longer resolution times. It stands to reason that more experienced
customers have more difficult support problems.


Horses for courses: a value-based information strategy 

Given that one of the primary business objectives was to increase productivity of the 
team (Boyd, 2004b), it was clear that procedures needed to be put into place to optimise 
and control channel usage. Generally speaking, the high-value users were also volume 
users, so any strictly volume based controls would likely alienate high-value 
customers. Therefore, it was recognised that controls needed to take into consideration 
both factors. 
Table VI. Number of support tickets opened by strategic value and information channel

Strategic value    Total*   E-mail    %   Phone    %   Web    %
0                      24        9   38      15   63
1                     804      278   35     496   62    30    4
2                     244       70   29     167   68     7    3
3                     171       50   29     119   70     2    1
4                     233       24   10     207   89     2    1
5                     122       28   23      93   76     1    1
6                     528      115   22     402   76    11    2
7                     259       75   29     177   68     7    3
8                     265       63   24     197   74     5    2
9                     101       33   33      68   67
10                    355       70   20     285   80
Grand total         3,106      815   26   2,226   72    65    2

Note: *Others removed

Table VII. Average days open (ADO) by country of origin and experience level

Low Medium High Avg ADO 
DACH 15 40 21 
Denmark 39 39 
France 21 21 
The Netherlands 7 7
Russia 3 3 
Spain 13 13 
UK 8 9 15 10 
Grand total 10 13 16 12 


Notes: Customers with unknown experience levels omitted; For customers with SVI >= 4 


As a basis for determining a way forward, a count of the number of support tickets
(y-axis) was plotted by the Strategic Value Index (SVI) score on the x-axis (Figure 3).
Since the inputs to this exercise were qualitative, the management team did not want to
put too much emphasis on the actual SVI score, but rather wanted to determine
response strategies for groups of similar customers. Therefore, high-volume users were
defined as having opened more than 110-130 tickets, medium-volume users as having
opened 35-130 tickets and low-volume users as having opened 0-65 tickets (notice the
overlap between groupings). Similarly, high-value customers were defined as having an
SVI score of 7-10, medium-value customers between 3-7, and low-value customers
between 0-3 (again, notice the overlapping scores). The plot was then segmented using
these fuzzy value scores (noted by the grey and black dotted lines).

Using the segments as a guide, response strategies could then fairly easily be 
derived for groups of like customers. High-volume/high-value customers were given 
official priority call handling and, in some cases, named support analysts were 
assigned. However, so they could not abuse their status, contracts were reviewed and 
customers that were not in compliance were “gently” encouraged to follow agreed 
procedures. High-value, medium-to-low usage customers were left alone. 

Medium-value to medium/high-usage customers were also subject to contract
reviews and in many cases encouraged to take additional product training (to reduce
future usage). Medium/low-value and medium/low-usage customers were directed to
use e-mail and web self-help support, as well as take additional training. Finally, low
usage/low-value customers were subject to review and contract termination (as they
would likely remain forever unprofitable).
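
The response strategies above amount to a small rule table. A minimal sketch follows, assuming crisp "low", "medium" and "high" labels for value and usage; the rule list and the function name are hypothetical paraphrases of the text, and because the groups overlap a customer may match more than one rule.

```python
# Each rule: (value labels, usage labels, response). Wording paraphrases the text.
RULES = [
    ({"high"}, {"high"}, "priority call handling, named analyst, contract review"),
    ({"high"}, {"medium", "low"}, "leave alone"),
    ({"medium"}, {"medium", "high"}, "contract review, product training"),
    ({"medium", "low"}, {"medium", "low"}, "e-mail/web self-help, training"),
    ({"low"}, {"low"}, "review for contract termination"),
]


def strategies(value, usage):
    """Return every response whose rule matches the given value and usage labels."""
    return [
        response
        for values, usages, response in RULES
        if value in values and usage in usages
    ]
```

Combinations the text does not cover, such as low-value/high-usage customers, return an empty list rather than a guessed response.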

This programme was rolled out in late 2002 and early 2003. Follow-up data were 
collected in January of 2004. In a year, the information channel usage landscape had 
changed significantly. Despite the support team having one fewer member, the ADO for phone tickets
had been cut in half (from ten days to five) and the total ADO for all channels had
dropped from 12 to seven days (Table VII).

E-mail ADO remained relatively the same (17 in 2002 vs. 15 in 2003), but web usage 
had also dropped significantly, as this channel option had been discontinued. 

The optimisation strategy put into place in 2003 had radically altered the
composition of the customer base. The primary difference between 2002 and 2003 was
the reduction in the number of low-value customers and, consequently, in the number of
tickets opened by this group. In 2002, there were 212 supported customers, but through
the elimination of support for low-value/low-volume users, the number of supported
customers was down to 174 in 2003 (Table IX).














                                             Status
2003 support tickets by source   Total   Closed   Open   2003 ADO   2002 ADO
E-mail                             617      596     21         15         17
  (%)                               20
Other                                8        8                 1         14
  (%)                                0
Phone                            2,495    2,433     62          5         10
  (%)                               80
Web                                 16       16                27         21
  (%)                                1
Grand total                      3,136    3,053     83          7         12
  (%)                                        97      3

Figure 3. Fuzzy control system to determine information channel strategy

Table VII. Number of tickets opened and average days open by information channel (2003)

Table IX. Comparison of number of support tickets and customers by strategic value











              2003                                2002
SVI           Tickets    %   Customers    %      Tickets    %   Customers    %
0                  19    1           2    1           24    1           1    0
1                 518   17          97   56          804   26         132   62
2                 263    8          12    7          245    8          19    9
3                 230    7          21   12          172    6          20    9
4                 277    9          10    6          233    7           9    4
5                 198    6           5    3          122    4           4    2
6                 692   22          13    7          529   17          11    5
7                 227    7           9    5          261    8          10    5
8                 252    8           2    1          265    9           3    1
9                 146    5           1    1          101    3           1    0
10                314   10           2    1          355   11           2    1
Grand total     3,136              174             3,111              212


As a result of the training programme for medium-value customers, there was actually 
an increase in the number of tickets opened by this group. Anecdotally, the team seems 
to think that the increased focus on the mid-value group has actually encouraged them 
to use the software more and begin to do more complex things. As an important 
strategic group (with the potential of being high-value customers of the future), 
management was pleased with this increased usage despite additional costs. 

The business goals of the business process redesign efforts were to:

* Increase productivity, essentially “do more with less”.
* Decrease costs, i.e. the cost of servicing customers.
* Improve morale of the support staff.


The optimisation strategy based on the fuzzy control system allowed the support team 
to focus on answering more queries from the medium-value customers and reduce the 
overall ADO. As such, productivity was clearly increased and cost reduced as the same
number of tickets could now be answered with fewer staff. With these measurable and
demonstrable results, the team could not help but feel proud of their
accomplishments. A fourth benefit, although not quantitatively measured, was 
improved customer satisfaction associated with improved service, priority handling 
and focus on mid-level customers. 


Implications and further research 

The literature has long shown that information-seeking is situation-dependent and 
individualised. Several factors such as access to information, trust of source and the 
quantity and quality of information received can influence information-seeking 
choices. With a micro-study of a single organisation it is impossible to draw general 
conclusions about the nature of information channel usage. However, there were 
several interesting observations that warrant further investigation: 


(1) There is evidence that the use of synchronous communications such as the 
telephone resulted in better service than asynchronous methods such as e-mail 
and the web. It would be interesting to compare several multi-channel support 
operations across industries to see if this is a universal phenomenon. 





(2) Geographic proximity and a common language may be a driver of channel
choice. That is, non-English-speaking customers and customers outside the UK were
more likely to use e-mail and the web, whereas UK customers and those with
closer strategic relationships were more apt to use the phone. Again, it would
be interesting to determine whether foreign customers are inadvertently
disadvantaged through the use of asynchronous channels. It would also be
interesting to understand whether the use of asynchronous channels is due to
language difficulties or to geographic (time-zone-related) factors.

(3) Familiarity: many researchers have noted the social nature of information
seeking. In this case, familiarity could be an influencing factor in the service
received. Further research would be necessary to determine, in general, how
much of an advantage familiarity gives certain information seekers in a
commercial environment.




The purpose of this part of the study was to validate the use of fuzzy information and 
fuzzy control systems to develop information strategies on a micro-level. With the 
recent proliferation and commercialisation of information channels — such as the web, 
e-mail, digital TV and wireless — coupled with increased globalisation, it will be 
increasingly important to understand aggregate drivers of information channel choice. 


Note 


1. Fuzzy control systems are an engineering method whereby response rules are based on
approximate input values (Boyd, 2004b, p. 86).

Abstract

Purpose — To map UK biomedical research by analysing biomedical publications from authors with 
UK institutional affiliation and indexed in Science Citation Index (SCI) and Social Sciences Citation 
Index (SSCI).

Design/methodology/approach — Bibliometric methods to assess the volume of research 
published, its impact and sources of funding of biomedical research in the UK are used. The 
analyses also include an examination of national and international collaboration, leading regions and 
institutions (by volume of output), types of research carried out and its potential impact factor. This 
was done for all of biomedicine and 32 selected sub-fields. The data used span 12 years, allowing 
changes and developments over time to be tracked. 

Findings — The UK’s position as the second largest producer of biomedical research is under threat 
from Japan and Germany and other countries with a traditionally weaker biomedical research base.
Strength in malaria and asthma research and relative weakness in surgery and renal medicine are notable.
The profile of UK biomedical research has changed significantly in the period analysed, with a doubling 
of the level of international collaboration, a significant increase in basic research papers and an increase in 
the potential impact of UK publications. A relative decrease in acknowledgements of UK Government
funding was noted, as were increased acknowledgements of UK not-for-profit and international
organisations.

Practical implications — Bibliometric analyses can provide reliable tools in mapping the development 
of scholarly disciplines which can be of use, as demonstrated in this paper, in research policy, as well as 
in domain analysis in information science, library collection development or publishing. 
Originality/value — Apart from policy applications, bibliometric research of this type can provide 
valuable information about changes in the patterns of scholarly communication within a domain 
(areas of interest in sociology of science and information science) and inform collection development 
policies in libraries and information centres (by describing literatures: ageing and obsolescence, 
volume and impact). 


Keywords Biotechnology, Research results, United Kingdom 
Paper type Research paper 


Introduction 

UK biomedical research[1] is growing fast. It is a vast and vital sphere of science
requiring high levels of funding. A strong research base is expensive and the government,
not-for-profit organisations and pharmaceutical companies, among others, provide the
money to support research activity. What analytical tools do they have that can help them
decide which areas need investment and what is value for money? And what is available
for administrative bodies managing biomedicine that allows them to see the full scope of
the research activities and forces and trends at work?

All analyses in this paper come from a research project funded by the Wellcome Trust and
reported in Webster, Lewison and Rowlands (2003). This paper was written during the author’s
employment at the Centre for Information Behaviour and Evaluation of Research, City
University, London EC1V 0HB.

This paper is an attempt to contribute to the discussion on evaluation of biomedical 
research. Using data from over 310,000 biomedical research papers published between 
1989 and 2000, it filters these papers into 32 disciplines within biomedicine and analyses 
their funding sources, their authorship, their places of origin and the impact they have on 
other research. It examines where the UK fits into the international biomedical research 
world — in terms of international collaboration as well as production levels. 

It relies on the analysis of published outputs of the research process (i.e. a journal 
article), considering a journal paper to be a worthwhile proxy for research activity — its 
volume and impact. 

Current discussions on the ways of evaluating the productivity and impact of a
scientific discipline focus on tensions between qualitative (e.g. peer review) and
quantitative (bibliometric) approaches (Martin-Sempere et al., 2002; Aksnes and Taxt,
2004) and on the proper calibration of research tools, e.g. the composition of peer panels
or the use of inclusive bibliographic data (van Leeuwen et al., 2001; Fernandez et al., 2003;
Stampfer, 2004). The literature identifies pros and cons of both approaches and lists numerous
methodological concerns and solutions to them. This paper relies solely on a
quantitative approach, largely using the comprehensive bibliographic databases listing
published (peer-reviewed only) outputs of British scientists.


Methodology 

The data for the analyses come either directly from SCI and SSCI (for international
comparisons) or from the Research Output Database (ROD), for the analysis of UK
performance and funding acknowledgements. These two resources provided
information for the analyses of the volume of biomedical outputs globally and in selected
countries, and of the UK’s position in the international arena. Also, detailed analyses of
the characteristics of UK biomedical published outputs were carried out; these included
volume, impact (as reflected in citation counts to publishing journals), funding
sources, leading regions and institutions, levels of intra- and inter-national
collaboration as well as categorisation of research (from basic to clinical
investigation). The analyses were conducted for biomedicine in general and for 32
selected biomedical sub-fields (a list of these can be found in the Appendix).

For the international comparison of UK biomedical research, papers were taken 
directly from SCI or SSCI; they were classed as biomedical if at least one author came from 
an institution with a biomedical term in its name. All other analyses were based on the
data from ROD, which includes papers with biomedical addresses as well as papers
published in journals classified as biomedical that have no biomedical address in
the institutional affiliation field (thus, ROD data will always yield more papers).

Only articles, notes and reviews were extracted and papers were considered to be 
from the UK if at least one author came from an institution with an address in the UK 
(England, Scotland, Wales or Northern Ireland). The numbers of papers from other
countries were identified in a similar manner. 

All numerical analyses used integer counts, i.e. if a paper was authored by authors 
from three different institutions and/or countries, each institution or country was 
counted once: thus in many cases percentages will add up to more than 100. 
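
The counting rule above can be sketched as follows. The function names and the four-paper input are made up purely for illustration and are not data from the study; the point is only that each country is credited once per paper, so the percentages can exceed 100 in total.

```python
from collections import Counter


def integer_counts(papers):
    """Credit each country once per paper, however many of its authors appear."""
    counts = Counter()
    for countries in papers:
        counts.update(set(countries))  # set() removes repeats within one paper
    return counts


def percentages(papers):
    """Share of papers involving each country; shares may sum to more than 100."""
    counts = integer_counts(papers)
    return {country: 100 * n / len(papers) for country, n in counts.items()}


# Illustrative input only: one list of author countries per paper.
toy_papers = [["UK", "US", "UK"], ["UK"], ["US", "DE"], ["UK", "DE"]]
toy_shares = percentages(toy_papers)
```

Here the UK is credited with three of the four papers (75 per cent) and the US and Germany with two each (50 per cent), giving 175 per cent in total.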
Table I. Journals by research level (RL), according to CHI categorisation








Biomedical sub-fields 

Papers to be included in biomedical sub-fields were identified by retrieving papers 
published in specialist journals and those with specified title keywords. This technique 
of identifying papers in specific subjects yields better results, in terms of precision and 
recall, than simply relying on the ISI’s much contested classification of journals[2]. 
Papers for inclusion in these sub-fields are determined by using specially constructed 
search strategies or filters, developed by a bibliometrician — together with a subject 
specialist. First, they compile a sample list of papers from existing specialist journals 
or specialist departments. Then all significant words from paper titles are ranked 
according to frequency of occurrence and then scanned. A proportion of the words are 
retained if they are indicative of a paper’s relevance to a particular sub-field. When 
these retained words are used in conjunction with Boolean operators they form a filter 
which allows an extraction of relevant papers from the ISI disks. Each filter is checked 
for precision and recall and calibrated — the calibration factor is an estimate of the 
number of papers actually present in a sub-field, compared to the number identified by 
the filter. The concepts of precision and recall are fundamental in information retrieval 
and they allow the assessment of the accuracy (precision) and the comprehensiveness 
(recall) of our search strategy. 
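
The paper does not give a formula for the calibration factor. One common reading, assumed here, is that it equals precision divided by recall, so that multiplying the number of papers a filter retrieves by this factor estimates the number actually present in the sub-field. The function names and the figures in the test values are hypothetical.

```python
def precision(relevant_retrieved, retrieved):
    """Fraction of retrieved papers that are relevant to the sub-field."""
    return relevant_retrieved / retrieved


def recall(relevant_retrieved, relevant_total):
    """Fraction of known relevant papers that the filter retrieved."""
    return relevant_retrieved / relevant_total


def calibration_factor(p, r):
    """Assumed definition: precision over recall."""
    return p / r


def estimated_true_count(retrieved, p, r):
    """Estimate of papers actually in the sub-field, given the filter's output."""
    return retrieved * calibration_factor(p, r)
```

For instance, a filter with precision 0.8 and recall 0.9 that retrieves 1,000 papers would suggest roughly 889 papers truly belonging to the sub-field under this assumption.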


Funding acknowledgements 

Funding acknowledgement analysis was based on information gathered in the ROD 
database. This information was recorded by physically examining research papers for 
funding acknowledgement information and coding into the ROD database. 


Research level (RL) and potential impact category (PIC)

Other analyses conducted in this study include an examination of the types of research 
carried out by UK biomedical researchers (RL categorisation) and the potential impact 
of their work (PIC). It has to be noted that both classifications are for journals in which 
papers are published, not individual papers, as the analyses taken down to the level of 
individual papers would be very difficult to conduct keeping in mind the volume of 
data analysed (over 350,000 papers over a 12-year period). 

All journals in ROD are categorised on a four-point scale by their research level (RL) 
from RL1 (clinical observation papers) through to RL4 (basic research papers). This is a
standard industry classification of research journals, developed by CHI Research
Inc. in the USA (Narin et al., 1976). Table I lists research type categories together with
examples of journals in each category. 

RL   Description              Examples
1    Clinical observation     BMJ, J. Royal Coll. Gen. Pract.
2    Clinical mix             Gut, Lancet, Brit. J. Clin. Pharmacol.
3    Clinical investigation   Clin. Science, Diabetes, Eur. J. Cancer
4    Basic research           J. Physiol., Phil. Trans. Roy. Soc. London B

The overall potential impact of a journal can be measured in several ways. In this
study it was calculated as the mean number of citations received by papers published
in the journal over the period of their publication year and the four following years
(designated C0-4). This is not the conventional impact factor (IF) published annually by
the Institute for Scientific Information (for definition and usage see:
www.isinet.com/essays/journalcitationreports/7.html/), but this approach has a
notable advantage in that it covers a longer time period, which normally includes
the peak year for citations to a research paper.

Mean impact factors can change with time. Highly-cited, well-recognised journals 
tend to expand — their PIC will rise — while those that are less-cited may contract, 
merge or even close. 

For practical purposes, journals are grouped into four different potential impact
categories (PICs), according to their C0-4 value, with the highest quality papers in PIC4 and
the lowest in PIC1. These, including the critical C0-4 values, are shown in Table II.
Normally, we expect about 10 per cent of journals to fall into the top (PIC4) category, 20
per cent into PIC3, 30 per cent into PIC2 and 40 per cent into the lowest-impact
(PIC1) category. UK biomedical research fits well within this pattern.
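
As a rough sketch of the two steps, the five-year citation mean and the PIC assignment could be computed as below. The function names are hypothetical, the critical values are the ones listed in the PIC classification table (below 6, 6 to 11, 11 to 20, 20 and above), and treating each lower bound as inclusive is an assumption, since the table does not say which category a journal sitting exactly on a boundary belongs to.

```python
def c0_4(citations_per_paper):
    """Mean citations per paper, counted over the publication year plus four years.

    citations_per_paper: one citation total per paper, already restricted
    to that five-year window.
    """
    return sum(citations_per_paper) / len(citations_per_paper)


def pic(c):
    """Map a five-year citation mean to a potential impact category (1 to 4).

    Assumption: lower bounds are inclusive (6 -> PIC2, 11 -> PIC3, 20 -> PIC4).
    """
    if c < 6:
        return 1
    if c < 11:
        return 2
    if c < 20:
        return 3
    return 4
```

A journal whose papers average 6.0 citations in the window would therefore be placed in PIC2 under these boundary assumptions.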


Findings 

Biomedical publications — international trends and the position of the UK 
Biomedicine is one of the most heavily-researched areas of science. Nearly half of all 
papers listed in the Science Citation Index (SCI) between 1989 and 2000 came from 
within this field, compared to a 1999 figure of 15 per cent from physics and 13 per cent 
from chemistry (Science and Engineering Indicators, 2002). In total, over 3,500,000 
biomedical articles, notes and reviews[3] were published worldwide over this period. In 
the UK, 55 per cent of all science publications are biomedical. 

While the worldwide proportion of science papers made up by biomedicine has 
remained steady at around 48 per cent of all papers produced every year for the past 12 
years, its representation in the scientific output of individual selected countries has 
been less stable, as Figure 1 shows. The US, The Netherlands, Canada and Germany 
have all increased their output of biomedical papers compared to other areas of science, 
while in Sweden and the UK publication of biomedical papers has relatively decreased. 

These changes may have arisen for a number of reasons. In the US, for example, the 
rise reflects a shift in the research environment, connected to funding. In the 1990s, the 
budgets of the National Institutes of Health and other funders of biomedical research 
increased significantly (Nature, 2001). At the same time, the US saw a decline in the 
budgets of agencies connected to the physical and engineering sciences. 

The US was responsible for just under half (41 per cent — 1,250,212 papers) of 
worldwide biomedical research over the past 12 years and its output continues to rise. 


PIC   C0-4 values     Examples                                                                 % UK*
1     Below 6         Age and Ageing, Brit. Dent. J., J. Epidem. & Comm. Hlth                   39.6
2     From 6 to 11    Int. Arch. Allergy Immunol., Anesth. Analg., Neurosci. Lett., J. Urol.    29.8
3     From 11 to 20   FEBS Lett., J. Invest. Dermatol., Eur. J. Biochem., Biochem. J.           20.8
4     20 and above    J. Biol. Chem., Blood, J. Immunol., Proc. Nat. Acad. Sci., Lancet          9.8

Note: *Based on the analysis of all UK ROD papers between 1989 and 2000

Table II. Classification of journals by potential impact category (PIC)

Figure 1. Biomedical outputs of eight leading OECD countries as a percentage of all their SCI outputs, 1989-2000

Table III. Biomedical outputs of eight leading OECD countries, SCI data, 1989-2000 and average annual percentage growth (AAPG)
But year on year (as shown in Table III), all eight leading countries have been
producing more papers. Germany’s production of papers has shown the sharpest
average annual growth rate, 4.6 per cent, while the US’s growth was the slowest at 1.4
per cent. One result of this is that the total global share produced by the US is falling,
from 42 per cent in 1989 to just under 39 per cent in 2000.
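
The falling US share can be checked directly from the 1989 and 2000 rows of the country output table. The helper name below is hypothetical; the paper counts are the tabulated US and world totals for those two years.

```python
def world_share(country_papers, world_papers):
    """Country output as a percentage of world output."""
    return 100 * country_papers / world_papers


# US and world biomedical paper counts for 1989 and 2000, from the table.
us_share_1989 = world_share(92_917, 219_829)
us_share_2000 = world_share(111_216, 285_222)
```

Rounded to one decimal place these give 42.3 and 39.0 per cent, consistent with the "42 per cent" and "just under 39 per cent" quoted in the text.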

The UK has been the second most prolific producer over the whole period, 
responsible for around a tenth of global biomedical papers each year (or 311,684 papers 
over the 12-year period analysed here). But its output of biomedical papers relative to other
areas of science has fallen slightly, and since 2000 Japan, with its faster average
annual growth rate in biomedicine, has replaced the UK as the second most prolific
producer of biomedical papers.


BIOM        World         US        UK        JP        DE        CA       NL       SE       AU
1989      219,829     92,917    21,937    17,567    14,899    10,455    6,068    6,331    5,239
1990      224,758     95,220    22,750    18,276    15,021    10,782    6,256    6,456    5,228
1991      229,171     97,520    23,148    19,464    15,242    11,223    6,463    6,318    5,407
1992      241,501    101,574    24,541    21,558    16,515    12,129    7,124    6,504    5,783
1993      242,088    102,515    24,936    22,244    16,388    12,107    7,688    6,507    6,014
1994      254,719    105,581    26,559    24,157    18,022    12,646    8,042    6,892    6,413
1995      260,804    108,684    26,962    24,527    18,792    12,634    8,596    7,118    6,814
1996      265,432    107,579    27,577    25,712    20,068    12,999    8,482    7,387    6,973
1997      269,798    108,144    27,136    26,545    21,700    12,801    9,168    7,595    7,157
1998      281,709    110,582    28,215    29,019    23,930    13,010    9,336    7,762    7,638
1999      279,615    108,680    28,633    28,372    23,818    13,675    9,284    7,792    7,788
2000      285,222    111,216    29,290    29,486    24,608    13,330    9,342    7,614    7,859
Total   3,054,646  1,250,212   311,684   286,927   229,003   147,741   95,799   84,276   78,313
AAPG          2.3        1.4       2.1         4       4.6       1.8      3.4      1.9      3.9
There have also been significant changes in the biomedical productivity of nations
outside the top eight. For instance, Russia almost halved its annual production over the
period (from over 6,000 to around 2,900 papers annually), with a resulting dramatic drop
in its worldwide biomedical share. This massive drop could be partly attributed to
the dramatic decreases in research funding in Russia. The number of researchers
declined from 1 million in 1990 to 0.52 million in
1995, of whom half are thought to be working in other sectors of the economy, while
many of the best have gone abroad. The federal R&D budget declined from US$10
billion to US$2.45 billion over the same period, and the number of science doctorates
awarded went down from 29,000 in 1992 to 14,000 in 1995 (Science, 1996).
Elsewhere, South Korea saw a marked production increase, with its share of world
output rising from 0.1 per cent in 1989 to 1.4 per cent in 2000. China, Taiwan and Brazil
more than doubled their share of biomedical papers.

It is difficult to know if these increases in paper production and participation in 
world biomedicine reflect actual increases in research output or if they simply reflect a 
shift in the practices of researchers: from publishing in local journals, not covered by 
the SCI, to publishing in international journals[4]. 

Figure 2. Correlation between countries’ biomedical outputs and their GDP

There seems to be a correlation between a country’s production of biomedical
papers and its gross domestic product (GDP). Figure 2 shows a plot of mean numbers
of biomedical papers produced by countries between 1998 and 2000 against their GDP
(expressed in billions of $US) in the year 2000 (The Economist, 2003). The correlation is
statistically significant (r = 0.961, at the 0.01 level).
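
The paper does not say how r was computed; a minimal sketch, assuming the usual Pearson product-moment coefficient, is given below. The function name is hypothetical, and the per-country paper counts and GDP values behind Figure 2 are not reproduced in the text, so none are supplied here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)
```

With xs as each country's GDP and ys as its mean annual paper count, a value close to 1 indicates that output rises almost linearly with GDP.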


Thirty-two biomedical sub-fields 

“Biomedicine” as an umbrella term covers a wide and diverse spread of research. Some 
fields — genetics or endocrinology, for example — are far more researched than others. 
And just as national rates of biomedical production shift with time, so too does the
number of papers published in each field. Every country, including the UK, has
unique areas of relative commitment where the number of papers produced is relatively
higher than the average proportion of biomedical papers it produces as a whole. It is
important to note that the 32 sub-fields analysed here do not provide an exhaustive 
classification of the entire sphere of biomedicine. They are selected examples of 
research areas, which vary in size — from motor neurone disease with only 290 papers 
published annually, to infectious disease, with over 35,000 papers — and scope — from 
basic research to clinical reports. 

In every sub-field analysed in this study, the number of papers produced has risen 
over the past 12 years. Table IV shows the average annual growth in biomedical paper 
numbers from each sub-field in the world and the UK, and their mean growth over the 
period. 

We see that biomedical engineering, motor neurone disease, gerontology and 
multiple sclerosis are the areas showing the fastest worldwide growth. Globally, these 
fields grew on average by at least 5 per cent every year. At the other end of the scale, 
the sub-fields of immunology and allergy, obstetrics and gynaecology, haematology 
and veterinary medicine are expanding more slowly, with average annual growth
rates of less than 1 per cent.

In many fields, UK biomedical research has followed the global growth trends. In 
the UK — just as worldwide — the fields of biomedical engineering, motor neurone 
disease and gerontology have grown at an average annual rate of 5 per cent or over. 
But there are some notable differences. Stroke research has a much faster average 
growth rate in the UK (over 5 per cent annually) than it does worldwide. While the 
average annual number of UK papers published in gastroenterology and renal 
medicine has gone down, it has increased at the world level. 

Another noteworthy difference is the UK’s commitment to asthma and malaria 
research. The UK outputs in these disciplines contribute over 17 per cent each to world 
production. This is almost twice as much as the UK’s commitment to biomedical 
research overall (10 per cent of world production). 


Focus in different sub-fields 

Globally, infectious disease research papers make up the largest proportion (14 per 
cent) of all the biomedical papers produced over the last 12 years. But in the UK, the 
proportion of research into infectious disease is slightly higher at 15 per cent of total 
biomedical paper output. Figure 3 shows the difference between the UK and the world 
in the proportion of papers produced in each sub-field. Tropical medicine, arthritis and 
neonatology and paediatrics are notable areas in which the proportional output is 
greater in the UK than the world, while in oncology, surgery and cardiology the 
opposite is the case. 










Code    Sub-field                      World (n)   AAPG (World)    UK (n)   AAPG (UK)   UK %
ASTHM   Asthma                            19,295            4.3     3,456         3.1   17.9
MALAR   Malaria                           12,959            1.9     2,241         4.5   17.3
TROPM   Tropical medicine                 55,104            1.5     8,066         2.6   14.6
MULSC   Multiple sclerosis                10,250            5.3     1,358        -4.9   13.2
MONED   Motor neurone disease              3,265            5.9       431         5.5   13.2
DENTA   Dental research                   49,545            2.4     6,484         1.9   13.1
ARTHR   Arthritis and rheumatism          81,190            2.6    10,586         1.6   13.0
RESPI   Respiratory medicine             156,927            2.2    17,980         1.6   11.5
CHILD   Neonatology and paediatrics      192,236            2.4    21,799         2.1   11.3
MENTH   Mental health                    170,299            2.9    18,752         3.3   11.0
OTORH   Otorhinolaryngology               63,846            2.8     6,966         1.0   10.9
DERMA   Dermatology and venereology      110,184            1.3    11,966         0.5   10.9
INFEC   Infectious diseases              438,432            2.1    46,769         1.7   10.7
OBSGY   Obstetrics and gynaecology       183,918            0.9    19,584         1.1   10.6
OPTHT   Ophthalmology                     81,868            2.0     8,696         2.1   10.6
DIABE   Diabetes                          56,180            2.4     5,814         1.5   10.3
VETER   Veterinary medicine              150,287            0.3    15,550         1.2   10.3
GERON   Gerontology                      100,410            5.0    10,205         5.0   10.2
IMMAL   Immunology and allergy           276,971            0.8    27,341         0.9    9.9
GENET   Genetics                         408,534            3.8    40,129         4.1    9.8
HAEMA   Haematology                      207,405            0.7    19,504         0.4    9.4
GASTR   Gastroenterology                 243,480            1.7    22,884        -0.6    9.4
AIDSR   AIDS research                     48,853            1.8     4,479         1.9    9.2
ENDOC   Endocrinology                    406,120            1.1    37,020         0.5    9.1
ONCOL   Oncology                         365,445            3.0    32,709         1.3    9.0
NEUSC   Neuroscience                     310,863            1.7    27,290         2.2    8.8
CARDI   Cardiology                       352,726            1.3    30,773         1.7    8.7
BIENG   Biomedical engineering            41,092            6.3     3,576         6.7    8.7
STROK   Stroke research                   31,417            4.8     2,657         7.6    8.5
RENAL   Renal medicine                    87,031            1.3     7,187        -0.6    8.3
TROVE   Tropical veterinary medicine      27,044            2.4     2,233         3.1    8.3
SURGE   Surgery                          186,072            3.0    14,276         0.5    7.7


Table V shows the proportion of biomedical papers produced per sub-field in different 
countries thus highlighting areas of their relative strengths and weaknesses. Figures on 
the black background show the sub-fields where the proportion of papers produced by 
a country is significantly higher than that of the world, and correspondingly, figures on 
the grey background show sub-fields where the proportion of papers produced in a 
country was significantly lower than that of the world (one is used as a base number). 
The last column shows the mean percentage of papers in each sub-field produced 
worldwide. 
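
The exact calculation behind these relative commitment figures is not spelled out. A plausible reading, assumed here, is the ratio of a country's share of its own biomedical output in a sub-field to the world's share in that sub-field, with 1 meaning parity. The function names are hypothetical; the inputs are the 1989-2000 malaria and all-biomedicine totals tabulated earlier for the UK and the world.

```python
def subfield_share(subfield_papers, all_papers):
    """Fraction of biomedical output that falls in one sub-field."""
    return subfield_papers / all_papers


def relative_commitment(country_subfield, country_total, world_subfield, world_total):
    """Country's sub-field share divided by the world's sub-field share (1 = parity)."""
    return subfield_share(country_subfield, country_total) / subfield_share(
        world_subfield, world_total
    )


# UK malaria papers, UK biomedical papers, world malaria papers, world biomedical papers.
uk_malaria = relative_commitment(2_241, 311_684, 12_959, 3_054_646)
```

This gives a value of about 1.7 for UK malaria research, i.e. well above parity. The table reports means over the period, so its published values need not match this aggregate figure exactly.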

The analysis has also shown that, over the period, the proportion of research papers
which focused on biomedical engineering, genetics, stroke, gerontology, mental health
and motor neurone disease rose in all eight countries, while the proportion of
publication outputs in cardiology, endocrinology, haematology, immunology and renal
medicine fell.

Table IV. Total numbers of papers in the SCI for 32 biomedical sub-fields (and for mental health, also in the Social Sciences Research Index) from the world and UK, average annual percentage growth, 1989-2000 and mean UK share of world research outputs

Figure 3. World and UK outputs in 32 sub-fields as a percentage of world and UK biomedical outputs, 1989-2000

Characteristics of UK biomedical outputs

Analysis of ROD yielded 355,188 biomedical papers published by UK authors[5]. The 
annual number of papers produced grew on average by around 2.3 per cent each year, 
taking total production levels from 24,141 papers in 1989 to 33,972 in 2000. The most
dynamic growth in British biomedicine took place in the early 1990s. After this time, 
growth slowed down, and the numbers of papers slightly declined in 1997 (see Figure 4). 

Other trends in UK biomedicine include growth in the number of papers with 
multiple authorships, multiple addresses and international authorships. For instance, 
the percentage of papers with five or more authors has grown by 18 per cent (to 38 per 
cent of all papers) from 1989 to 2000, while the number of papers with a single author is 
down to 10 per cent (from 16 per cent in 1989). This tendency towards collaboration 
applies to institutions as well as authors. A large portion of papers (87 per cent) still
come from single institutions (as compared with 53 per cent in 1989). And the yearly
proportion of papers with five or more institution addresses has more than quadrupled 
since 1989 — from 2 per cent to 9 per cent of the total. 

One factor that has accompanied this rise in addresses is the rise in non-UK 
co-authorship of papers. Papers co-written with a non-UK author doubled over the 
period — from 16 per cent in 1989 to 32 per cent in 2000. The USA is still the UK’s 
leading partner in biomedical research (share of co-authored papers more than doubled 
— from 5 per cent to nearly 11 per cent between 1989 and 2000), but links with the EU 
countries are rising even faster. Nearly 16 per cent of all papers produced in the UK in 
2000 involved collaboration with one or more of the EU countries, compared to just 
over 6 per cent in 1989. In all, UK scientists co-authored biomedical papers with 
colleagues from 190 different countries. Figure 5 shows the mean percentages of UK 
biomedical papers co-authored with leading countries between 1989 and 2000. 


Leading UK regions and institutions 

Sub-field  AU    CA    DE    JP    NL    SE    UK    US    World
MALAR      2.47  0.46  0.67  0.37  1.30  0.89  1.71  0.85  04
MULSC      0.77  1.28  4116  0.61  1.25  1.40  12%   1.03  0.34
MONED      0.63  1.56  0.97  1.48  0.80  1.09  1.29  0.98  0.11

In order to find out about published outputs from UK regions, the country was split 
into the four main regions and publication counts were plotted against population and 
GDP figures (GDP figures for year 1994). We can see that GDP relates more closely to 
biomedical output than population (in England, Northern Ireland and Wales, at least). 
However, Scotland has a much higher biomedical output than would be expected from 
either its GDP or its population, compared to the rest of the UK. It has less than 9 per 
cent of the UK’s total GDP, but publishes around 14 per cent of biomedical papers 
(Table VI). 
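The per-capita and per-GDP ratios behind this comparison can be recomputed from the Table VI figures. The sketch below is illustrative only: the function name is hypothetical, and the input values are transcribed from the printed table (the Wales row is read as 2.91 million and £23.8bn, an assumption that agrees with the printed ratios).

```python
# Values transcribed from Table VI: population (millions), GDP (£bn, 1994),
# and mean annual number of ROD papers, 1989-2000.
# The Wales figures are an assumed reading of a damaged table row.
REGIONS = {
    "England": (48.71, 483.4, 24968),
    "N. Ireland": (1.64, 13.2, 581),
    "Scotland": (5.13, 50.0, 4189),
    "Wales": (2.91, 23.8, 1130),
}


def output_ratios(population_m, gdp_bn, mean_papers):
    """Return (papers per million inhabitants, papers per £bn of GDP)."""
    return mean_papers / population_m, mean_papers / gdp_bn


if __name__ == "__main__":
    for region, values in REGIONS.items():
        per_million, per_bn = output_ratios(*values)
        print(f"{region:<11} {per_million:6.0f} papers/M  {per_bn:5.1f} papers/£bn")
```

On these inputs Scotland comes out well above the other three regions on both measures, which is the pattern the text describes.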

For a more detailed breakdown of biomedical production in the UK, the country was 
divided into 124 different postcode areas and these, in turn, were ranked in order of the 
proportion of the total papers they produced. London West Central produced 7.5 per 
cent of all UK biomedical papers (altogether, London produced 36 per cent of all 
published outputs). Around 5 per cent of papers came from CB (Cambridge), OX 
(Oxford) and W (London West) postcode areas each. Table VII shows leading 
institutions in the leading postcode areas. 


Type of UK biomedical research (from basic to clinical) 
All biomedical research can be placed on a continuum from basic to clinical. It is 
important that each national system achieves an appropriate balance of research. One 
way of monitoring this balance is to classify each biomedical paper published 
into an appropriate category. 

Table V. Eight OECD countries’ mean commitment to 32 biomedical sub-fields, 1989-2000

Table VI. Populations, gross domestic product and numbers of biomedical papers published in England, Northern Ireland, Scotland and Wales

Region      Population (M)  GDP (£bn)  ROD 1989-2000  Mean    Papers/M  Papers/£bn
England     48.71           483.4      299,618        24,968  513       51.7
N. Ireland  1.64            13.2       6,971          581     354       44.0
Scotland    5.13            50         50,264         4,189   817       83.8
Wales       2.91            23.8       13,562         1,130   388       47.5
Total UK    58.39           578.7      355,188        29,599  507       51.1

Postcode  Post town  Institution

WC  London West Central  7.5%
    University College London
    London School of Hygiene and Tropical Medicine
    Imperial Cancer Research Fund
    Institute of Child Health
    King’s College
    Institute of Neurology
    Great Ormond Street Hospital
    National Hospital for Neurology and Neurosurgery
    Birkbeck College
CB  Cambridge
    University of Cambridge
    Laboratory of Molecular Biology (MRC)
    Addenbrooke’s Hospital
    Babraham Institute (BBSRC)
OX  Oxford
    University of Oxford
    John Radcliffe Hospital
    Radcliffe Infirmary
    Medical Research Council (several)
    Churchill Hospital
W   London (West)  5.4%
    Hammersmith Hospital/Royal Postgraduate Medical School (ICSTM)
    St Mary’s Hospital (ICSTM)
    University College London
    Charing Cross Hospital (ICSTM)
    University College and Middlesex School of Medicine
    Imperial College of Science, Technology and Medicine
    King’s College
Notes: institutions are size-coded in five bands: between 500 and 1,000 papers; between 1,000 and 2,000 papers; between 2,000 and 5,000 papers; between 5,000 and 10,000 papers; over 10,000 papers

One method of looking at the balance of research is to examine the publications in 
any given journal and then to categorise the journal according to the predominant type 
of research it carries. The journal classification used in this report has been devised by 
CHI Research Inc. All journals are assigned a research level based on a four-point scale: 
RL1 (clinical observation), RL2 (clinical mix), RL3 (clinical investigation) and RL4 
(basic research). Figure 6 also shows the classification of ROD papers for the three 
four-year periods studied. 

UK researchers published around 30 per cent of all their publications in basic 
journals (RL4) while only some 20 per cent were in journals classed as clinical (RL1). 
Moreover, we notice a further increase of basic research papers (and a slight increase in 
clinical observation) and a decrease of clinical investigation and clinical mix publications 
(RL3 and RL2). 

Table VII. Postcode areas with most biomedical outputs in the UK, 1989-2000

Figure 6. Distribution of UK biomedical papers by research level (RL), 1989-1992, 1993-1996, 1997-2000. Legend: (not assigned), Clinical observation, Clinical mix (RL2), Clinical investigation, Basic research (RL4)

The balance of research varies around the country. Of the postcode areas that published 
at least 200 papers per year, Norwich (NR), the site of the BBSRC’s John Innes Centre and 
the University of East Anglia, had the highest level of basic research (63 per cent of all 
papers). London EC (the location of St Bartholomew’s Hospital) and London SE (home of 
King’s, Guy’s and St Thomas's) had the lowest proportions of basic research. 

Figure 7 shows the distribution of UK biomedical papers in 32 sub-fields by research 
level. Predictably, neuroscience and genetics had a majority of papers published in basic 
journals (60 per cent of papers in RL4 journals), while surgery had most papers in journals 
classed as clinical (nearly 85 per cent in RL1 category). Of course, the shift towards more 
basic research may in part result from rapid increases in published outputs in sub-fields 
which normally produce more basic research. For instance, neuroscience grows annually 
by 2.2 per cent and genetics by 4.1 per cent (compare: Table IV). 


Impact of UK biomedical papers 

One of the major benefits of bibliometric research is the fact that its techniques allow 
an assessment of the quality of scholarly endeavour. The concept of quality can be seen 
as highly subjective. But quantitative methods within bibliometrics can help assess the 
impact that a paper may have on the discipline — and thus provide a certain measure of 
the paper’s quality. One way of doing so is to measure the influence of the journal 
which publishes the paper — the journal’s impact factor. 

The impact of UK biomedical papers appears to have risen over this study’s 12-year 
period. The proportion of ROD papers that fall into PIC4, the highest category, has 
more than doubled, rising from 5.9 per cent to 13.2 per cent. The percentage of papers 
in the PIC3 and PIC2 categories also rose causing a significant drop (above 11 per cent) 
in the share of the papers in the lowest impact category, PIC1. It is not immediately 
clear if these changes are a result of an actual improved quality in UK biomedical 
papers, or if they are due to other factors. For example, our research demonstrated that 
the output of UK biomedical researchers has shifted from applied clinical (RL2 and 
RL3) towards basic research (RL4). Basic research is generally more highly cited, 
meaning that basic journals tend to have a higher PIC score than clinical journals. This 
would suggest that the rise in PIC may reflect a shift from clinical to more basic 
research rather than improved output impact. 

There seems to be a strong positive correlation between numbers of funding bodies 
that provide support for a paper and its PIC value (Figure 8). Papers with multiple 
funding sources are more likely to be published in higher-impact journals. This may be 
because a research proposal has to go through a rigorous assessment every time 
funding is sought — thus sharpening its focus and raising its impact. 

Figure 7. Classification of UK biomedical research in 32 sub-fields by research level (RL), 1989-2000

Figure 8. Effect of funding, authorship and addresses on the PIC of UK biomedical journals

A positive correlation between numbers of authors[6] and numbers of funding 
acknowledgements per paper and its PIC value was also noted, while, interestingly, 
inter-institutional collaboration had a negative impact[7]. The following formula 
(drawn from regression analysis of PIC) and Figure 8 illustrate this: 

$$\mathrm{PIC} = 0.056A + 0.000A^{2} - 0.055D + 0.005D^{2} + 0.104F - 0.009F^{2}$$

where A = Authors; D = Addresses and F = Funding acknowledgements 
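As a rough illustration of how the regression terms combine, the sketch below evaluates the formula for a given paper. It assumes the printed coefficients read as 0.056A + 0.000A² − 0.055D + 0.005D² + 0.104F − 0.009F², with the negative sign on the linear address term taken from the statement that inter-institutional collaboration had a negative effect. No intercept is printed, so the result is best read as a relative contribution rather than an absolute PIC value. The function name is hypothetical.

```python
def pic_contribution(authors: int, addresses: int, funders: int) -> float:
    """Evaluate the regression terms for A (authors), D (addresses), F (funders).

    The coefficients are an assumed reading of the printed formula;
    no intercept term is given in the source.
    """
    a, d, f = authors, addresses, funders
    return (0.056 * a + 0.000 * a**2
            - 0.055 * d + 0.005 * d**2
            + 0.104 * f - 0.009 * f**2)


if __name__ == "__main__":
    # Compare papers that differ only in the number of funding acknowledgements.
    for funders in (0, 1, 2, 3):
        value = pic_contribution(authors=3, addresses=1, funders=funders)
        print(f"A=3, D=1, F={funders}: {value:.3f}")
```

Under this reading, each additional funding acknowledgement raises the estimate over the range shown, consistent with the pattern described for Figure 8.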


Funding of biomedical research in the UK 
Information about funding of UK biomedical research for this report was gathered 
through analysis of acknowledgements present in research papers, notes and reviews 
and indexed in SCI and SSCI[8]. It is a standard practice to acknowledge any form of 
support given in aid of research activity and any subsequent publication of findings; 
thus, these data can be used accurately to establish names of funding bodies as well as 
type of support given (but not its monetary value). 

Several important trends emerge from the analysis of funding acknowledgements 
on UK biomedical papers. The shifts in patterns of acknowledgements tell a story of 
shifting funding patterns, collaboration among scientists and areas of research 
attracting more/less attention from funding bodies. 

In the 12 years under consideration in this study, nearly two-thirds of papers in 
ROD acknowledged at least one funder. There seems to be a correlation between 
acknowledgements and numbers of authors and addresses on papers. For instance, 57 
per cent of papers with one institutional address acknowledge no funding. Similarly, 70 
per cent of papers with no funding acknowledgements had only three or fewer authors. 
Also, together with the increase in numbers of authors or addresses per paper, we note 
an increase in percentages of papers with acknowledgements (e.g. over 85 per cent of 
papers with six or more authors acknowledge funding, while only 35 per cent of papers 
with one author do so). 

For the purpose of this analysis, all funders of UK biomedical research 
acknowledged on biomedical papers were divided into the following categories: 


* UK Government (UK Govt): including agencies, departments and local 
authorities. 
* Private UK not-for-profit organisations (UK PNP): i.e. charities, foundations, 
hospital trustees, mixed (i.e. academic), and other not-for-profit bodies. 
* Industry: UK and non-UK, including pharmaceutical, non-pharmaceutical and 
biotech companies, and veterinary practices. 
* Foreign (non-UK) governments and not-for-profit organisations: e.g. US 
National Institutes of Health or the Ford Foundation. 
* International organisations: e.g. the European Commission, the World Health 
Organisation. 

Figure 9 shows the proportion of papers supported by the different funding sectors in 
three four-year time periods. Overall, there are some 8,000 unique funding bodies 
which receive acknowledgements on UK biomedical papers, but only some 50 of them 
are acknowledged on 1,000 or more papers, 440 are acknowledged in between 100 and 
999 papers and a staggering 2,000 funding organisations received only one 
acknowledgement. 

There has been a drop in acknowledgements to UK Government funders, while UK 
private not-for-profit (most notably Wellcome Trust), foreign (Deutsche 
Forschungsgemeinschaft, DFG; National Institutes of Health, NIH) and 
international (European Union; European Molecular Biology Organisation, EMBO) 
funders increased their share of acknowledgements. 

The Medical Research Council (MRC) is the single most frequently acknowledged 
funder of UK biomedical research. The Wellcome Trust (WEL) is second (and it has 
substantially increased numbers of received acknowledgements in the past 12 years) 
and the Biotechnology and Biological Sciences Research Council (BBC) is the third 
most frequently acknowledged supporting body. 

Figure 9. Percentages of UK biomedical papers acknowledging support from five main sectors and with no funding acknowledged

Figure 10. Percentage of UK ROD papers acknowledging support from main funders, 1989-1992, 1993-1996 and 1997-2000 (key to codes used in Table AII in the Appendix)


Figure 10 shows the top 25 leading funders and the percentage of ROD papers in which 
they were given acknowledgements of funding in each of the three four-year periods 
between 1989 and 2000. We can see that there have also been large increases in 
numbers of acknowledgements to the European Union (CEC) and the EPSRC (EPR). 
Smaller rises in support have come from the British Heart Foundation (BHF) and the 
Royal Society (RYS). 


Funding of biomedical sub-fields 

There is a big variation in the levels of funding acknowledgements between different 
biomedical sub-fields analysed in this paper. Among the sub-fields with most funding 
acknowledgements we find, unsurprisingly, genetics and neuroscience. However, more 
surprisingly, the three sub-fields with the highest percentages of acknowledged 
funders were malaria, tropical medicine and tropical veterinary medicine. In the case of 
malaria, nearly every paper acknowledged, on average, two funders. Indeed, 
malaria papers were among the top four sub-fields acknowledging funding from UK 
not-for-profit (4th), industry (2nd), foreign (3rd) and international (1st) funders. 
Figure 11 shows the percentages of funding acknowledgements in 32 sub-fields by 
major funding sectors, 1989-2000. 


Conclusions 

This paper aimed to show the current state of biomedical research in the UK: what is 
its international position (in terms of volume of published outputs), who funds it, what 
are the characteristics of biomedical research in the UK (major players, type of research 
performed and its impact and levels of national and international collaboration). The 
paper concentrated on a survey of biomedicine as a whole as well as discussing all the 
above issues in relation to the 32 biomedical sub-fields. The study looked at 12 years of 
British biomedical outputs, from 1989 to 2000. 

Figure 11. Percentage of funding acknowledgements in 32 sub-fields by major funding sectors, 1989-2000

Major findings from this survey indicate that the UK’s place as the second largest 
producer of biomedical research is threatened by Japan and Germany which have had 
average annual percentage growth of 4 and 4.6 per cent respectively as opposed to 2.1 
per cent from the UK and 2.3 per cent for the world. In fact, in 2000, Japan, for the first 
time, produced more biomedical papers than the UK. It is also worth noting the 
relatively large increases in published outputs from countries which traditionally were 
not considered important producers of biomedical research. For instance, South Korea, 
China and Taiwan show annual increases in excess of 12 per cent. 

The UK’s participation in world research in 32 sub-fields varies from sub-field to 
sub-field: while the UK produces over 17 per cent of world outputs in asthma and 
malaria, its outputs in surgery fall below 8 per cent (10 per cent is an average for all 
biomedicine). Also, we note that gastroenterology and renal medicine research are the 
only two areas of decline in volume of outputs (their average annual percentage growth 
is -0.6 per cent respectively). 
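These shares can be turned into a simple relative measure. The sketch below assumes (the definition is not restated in this section) that a "commitment" value of the kind tabulated in Table V is a country's share of world output in a sub-field divided by its share of world output in biomedicine as a whole. The function name and the rounded input shares are illustrative only.

```python
def relative_commitment(subfield_share_pct: float, overall_share_pct: float) -> float:
    """Ratio of a country's world share in one sub-field to its overall world share.

    Values above 1 indicate a sub-field in which the country is
    over-represented relative to its overall biomedical output.
    """
    if overall_share_pct <= 0:
        raise ValueError("overall share must be positive")
    return subfield_share_pct / overall_share_pct


if __name__ == "__main__":
    uk_overall = 10.0  # UK share of all world biomedical output, per the text
    examples = [("asthma or malaria (over 17 per cent)", 17.0),
                ("surgery (below 8 per cent)", 8.0)]
    for label, share in examples:
        print(f"{label}: {relative_commitment(share, uk_overall):.2f}")
```

With the rounded shares quoted here, malaria comes out at about 1.7, close to the UK value of 1.71 printed for MALAR in Table V.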

Domestically, within the UK, we notice several changes in the outlook of biomedical 
research. We observe a steady growth in volume of publishing (which slows down 
after 1996) and we note an increase in international collaboration (especially with other 
countries of the European Union), as well as an increase in numbers of co-authored 
papers (both with other UK and international partners). While England, in absolute 
terms, is the biggest producer of biomedical papers in the UK, Scotland leads in the 
ratio of research papers to GDP and to population. Within regions, London produces 36 
per cent of all biomedical published outputs, while within individual postcode areas — 
London West Central (WC), Oxford (OX) and Cambridge (CB) are the biggest producers 
of research papers in biomedicine. 

The type of biomedical research carried out in the UK is also shifting: we note an 
increase in the proportion of papers published in basic science journals (RL4) as well as 
a slight increase in papers in clinical observation journals (RL1) and a decrease of 
papers in both clinical mix and clinical investigation journals (RL2 and RL3). 

Overall, there is a modest increase in the proportion of UK biomedical papers which 
acknowledge funding (from 62 per cent in the early 1990s to 66 per cent in the late 
1990s). There are, however, big variations in the distribution of these 
acknowledgements among main categories of funding. For instance, the share of 
acknowledgement to the UK Government is decreasing while all other sectors show 
increases (most notably foreign institutions — by some 10 per cent over 12 years). 
While the MRC is still the most frequently acknowledged funder of research (with 
nearly 16,000 papers between 1997 and 2000), the Wellcome Trust is closing the gap 
quickly (with some 14,000 papers in the same time period). Significant differences in 
levels of acknowledgements were noticed for different biomedical sub-fields. Malaria 
research papers show the highest percentages of acknowledgements (with on average 
two acknowledgements on each malaria paper) while surgery had, on average, one 
acknowledgement in every four papers. 

Together with this shift towards more basic, multi-authored and multi-funded 
work, we note an increase in the impact (PIC) of UK biomedical papers (there is a 10 per 
cent increase in proportion of PIC4 papers between 1989 and 2000). 

The quantitative method of bibliometrics was used to conduct all analyses in this 
study. This method, if used with care, can yield a wealth of information which may aid 
biomedical researchers who want to trace the volume and impact of published outputs 
or find out who funds research within their disciplines; policy makers interested in the 
position of a discipline in the national and international arenas; and funders evaluating 
“returns on investment” or as a tool in aiding their grant decisions. 

Apart from policy applications, bibliometric research of this type can provide 
valuable information about changes in the patterns of scholarly communication within 
a domain (areas of interest in sociology of science and information science) and inform 
collection development policies in libraries and information centres (by describing 
literatures: ageing and obsolescence, volume and impact). 

Notes 


1. Biomedical research can be defined as that undertaken to gain knowledge and 
understanding of basic biological processes and causes of diseases and having direct or 
indirect medical benefits. For the purpose of this paper, it encompasses basic biology 
(excluding botany and ecology), clinical medicine and biochemistry. It also includes animal 
health and social sciences allied to medicine. Biomedical research considered in this report 
was carried out in a range of institutional settings, including medical centres, hospitals, 
universities, research centres and institutes as well as pharmaceutical and biotech 
companies. 

2. For a more detailed description of the method see Lewison (1999). For discussion of other 
methods see Glänzel and Schubert (2003); DeBruin and Moed (1993). 

3. Only articles, notes and reviews were analysed in this report. They are referred to in 
different places as papers, publications or outputs. 

4. For instance, in China, a special unit within the Chinese Academy of Sciences was created to 
evaluate published outputs of Chinese scientists, stressing the importance of publishing in 
international journals (Wu et al, 2003). 

5. Note that the numbers of ROD papers are higher than those extracted from SCI using 
the biomedical address filter. This is because ROD, apart from biomedical address filter papers, 
contains papers which were published in biomedical journals, but came from addresses 
without a biomedical component. 

6. A recent study also concluded that highly cited papers are “typically authored by a large 
number of scientists” (Aksnes, 2003). 

7. A similar correlation was found for Austrian biomedical research (Lewison, 2003). 

8. The notion of using acknowledgements of funding on research papers as a way of assessing 
the success of research funding agencies is put forward in Jeschin et al. (1995). 
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Appendix 
AIDSR AIDS research 
ARTHR Arthritis and 


ASTHM 


BIENG 


CARDI 


CHILD 


DENTA 


DERMA 


DIABE 


rheumatism research 


Asthma research 


Biomedical 


engineering 


Cardiology 


Neonatology and 
paediatrics 


Dental research 


Dermatology and 
venereology 


Diabetes research 





Research on HIV, the virus which cases AIDS, includes the 
investigation of viral structure, replication, pathogenesis, 
epidemiology, and development of HIV vaccines and antiviral 
drugs. Other research involves the diagnosis of HIV and 
AIDS-related conditions, treatment and care of patients with HIV 
and AIDS and the efiect of anti-retroviral drugs and educational 
campaigns on the incidence of HIV and AIDS 


Research on the musculoskeletal system in health and disease 
including inflammatory arthritis, connective tissue diseases, 
degenerative joint diseases, arthritis associated with infection, 
non-articular rheumatism and bone disease but excluding 
orthopaedics. Study of the aetiology, epidemiology, treatment and 
basic biochemical, immunological, physiological and genetic 
mechanisms relevant to these diseases 


Research into causation, basic mechanisms, manifestations and 
treatment of asthma and closely related diseases such as allergic 
rhinitis; research inte the delivery of care to those with asthma and 
non-pharmacological management such as patient education and 
self-management 


It integrates physical, mathematical and life sciences and 
engineering principles for the study of biological systems in health 
and disease and for the application of technology to improve 
quality of life. It creates knowledge from the molecular to organ 
systems levels, develops materials, devices, systems, technology 
management, and methods for assessment and evaluation of 
technology, for the prevention, diagnosis, and treatment of disease 
and for patient care and rehabilitation 


Study of the heart and its functions in health and disease, including 
“invasive cardiology”, which is the practice of diagnostic and . 
therapeutic cardiac procedures that involve entry into the heart or 
central circulation 


Research into the prevalence, pathogenesis, prevention, treatment 
and outcome of health problems affecting new-born infants and 
children up to and including adolescence 


The branch of medical science that investigates the aetiology, 
epidemiology, diagnosis, prevention and management of 
craniofacial and oral disease. A multi-disciplinary approach is 
utilized to integrate oral and general health and the application of 
common methodologies and interventions 


Dermatology involves the epidemiology, prevention, diagnosis and 
treatment of structural, inflammatory, neoplastic and infective 
diseases of the skin. It also encompasses those internal diseases 
that have skin signs, but excludes burns and scalds. Venereology 
similarly covers all sexually acquired disorders that are manifest in 
signs and symptoms at the site of inoculation but excludes AIDS, 
HIV and hepatitis 


No definition provided 
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Table AI. 


ENDOC 


GASTR 


GENET 


GERON 


INFEC 


MALAR 


Endocrinology 


Gastroenterology 


Genetics 


Gerontology 


Haematology 


Immunology and 
allergy 


Infectious diseases 


Malaria research 











Study of the basic physiological mechanisms of mammalian 
endocrine systems including secretory regulation and peripheral 
actions. It also includes the study of the causation, occurrence, 
presentation, diagnosis and treatment of disorders associated with 
or caused by disturbances of endocrine homeostatic systems. It 
includes diabetes, but mot metabolic disorders arising through 
non-endocrine mechanisms 


Research into the cause, basic mechanisms, diagnosis and 
treatment of gastrointestinal disorders. This includes epithelial 
biology research, studies on the enteric nervous system and its 
neural connections to the central neural nervous system, and the 
immune system of the intestine. Also included are studies of the 
relationships between the luminal contents of the gut, especially 
bacteria, and gastrointestinal function. Additional research in 
technologies required to study the gastrointestinal tract, especially 
endoscopic and radiologic, is included, e.g. fibreoptic endoscopy, 
ultrasound, MR scanning 

The study of heredity and transmission of biological variation. It 
includes studies of genetic disease, of mapping genes on 
chromosomes, mutation, characterisation of DNA sequences and 
the control of gene expression as relevant to human and animal 
health. It also includes cross-disciplines including behavioural 
genetics, pharmacogenetics and immunogenetics 

Research addressing the aetiology of the ageing process in human 
beings, the treatment of ageing-related diseases, and societal 
measures to maintain the physical and mental health and wellbeing 
of elderly persons 

Research into the physiology of blood and the causes, diagnosis 
and treatment of blood disorders mediated by its constituent cells 
or plasma proteins 

Allergology comprises clinical studies and immunological research 
in mammals to study the body’s responses to normally innocuous 
substances that result in inflammation, including genetic 
predisposing factors and the effects of some infectious agents. 
Immunology is the study of the body’s response to infectious 
agents and inappropriate responses to other agents, including 
allergens, self-antigens and transplanted material 

Research into diseases of vertebrates whose cause can be traced to 
the presence or replication of an organism, the genetic material of 
which is not present in the chromosomes of most of the host 
population. It excludes disease of plants and invertebrates. It also 
excludes CJD, BSE and similar diseases, but includes retroviruses, 
whose genomes can be found integrated into the chromosomes of 
some members of the host population and all parasites 


Research relevant to malaria disease in animals and humans; 

the parasite causing malaria (species of the genus Plasmodium); 
and mosquito carriers. The full spectrum of research from basic 
cellular, biochemical and molecular studies, through clinical, 
epidemiological and field studies, vaccine and drug development 
studies, as well as operational research into the delivery of malaria 
treatment and control measures by health services and other 
organisations 


(continued) 





MENTH 


MONED 


MULSC 


NEUSC 


OBSGY 


ONCOL 


OPHTH 


OTORH 











Mental health 


Motor neurone 
disease research 


Multiple sclerosis 
research 


Neuroscience 


Obstetrics and 
gynaecology 


Cancer research 


Ophthalmology 


Otorhinolaryngology 





Research into the causation, occurrence, presentation, diagnosis, 
treatments and care cf disorders affecting the mental health of their 
sufferers in childhood, adolescence, adulthood and older age. The 
major diagnostic categories included are: anxiety disorders, bipolar 
disorders, conduct disorders, the dementias, depression, eating 
disorders, obsessive compulsive disorders, phobias, schizophrenia 
and suicide. Substance abuse is excluded. Mental retardation and 
handicap are not specifically included 


Research on causation, molecular and biochemical mechanisms of 
neurodegeneration and regeneration in amyotrophic lateral 
sclerosis and other motcr neuron diseases; research into markers of 
diagnosis and disease progression, pharmacological treatment and 
multidisciplinary care management 


Research into the cause, basic mechanisms, diagnosis and 
treatment of multiple sclerosis. This includes glial cell research and 
studies on astrocytes and the blood brain barrier relevant to MS. 
Also included are anima] models of MS such as experimental 
autoimmune encephalomyelitis and models induced by Theiler’s or 
Semliki Forest virus. It includes relevant research on myelin and 
myelin protein genes 


Excludes mental health, psychology and psychiatry and clinical 
studies on patients 


Research into the cause, basic mechanisms, diagnosis and 
treatment of all maternal aspects of obstetrics and disorders of the 
female genital tract including the vulva, cervix, uterus, ovary, 
Fallopian tubes anc the placenta. Such disorders include both 
non-neoplastic and benign and malignant neoplastic conditions 
including infertility, endometriosis, smooth muscle cell disorders 
and intraepithelial neoplasia 


The study and treatment of cancer or tumours. This incorporates 
academic oncology and clinical oncology. Academic oncology is 
aimed at identifying the causative agents or underlying genetic 
defects producing cancer and at developing these discoveries into 
effective drugs and other therapies. Clinical oncology is oriented 
towards the treatment, management and cure of cancer 


The medical specialty relating to the treatment of diseases and 
disorders of the eye. It includes studies of the structure and 
function of the eye, the diagnosis and treatment of diseases, injuries 
and defects of the eye and corrective surgery to remove cataracts, 
improve vision and remove diseased eye tissue 


Otology includes congenital and acquired diseases, including 
cancer, of the ear and posterior skull base, their causes, diagnosis, 
medical and surgical treatment, and patient rehabilitation. 
Rhinology similarly covers the nose, sinuses and anterior skull 
base, and includes allergy and facial plastics. Laryngology covers 
the upper aero-digestive tract, the head and neck, excluding the 
pau at eyes. Work on animals is excluded except for research on 
air cells 
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Table AI. 





RENAL: Renal medicine
Research into the causes, genetic and environmental, of diseases of
the kidneys and urinary tract, including the information generated
by national gene and clinical data banks, and into animal models of
human renal diseases. Studies of the epidemiology of renal
diseases, their early detection and the benefits and
cost-effectiveness of population screening. Results of controlled
trials of treatments including dialysis and transplantation and of
the audit of clinical practice through Renal Registries. Studies of
the complications of renal failure, notably heart disease, and of
their prevention and treatment. Basic science research relevant to
renal disease

RESPI: Respiratory medicine
Research into causation, occurrence, clinical features,
pathophysiology and treatment of diseases affecting upper and
lower airways and lung parenchyma

STROK: Stroke research
Research on the genetics, epidemiology, diagnosis, imaging,
causes, prevention and treatment of cerebrovascular disease.
Includes stroke, transient ischaemic attack, cerebral venous
thrombosis, cerebral infarction, cerebral (including subarachnoid)
haemorrhage and vascular (multi-infarct) dementia. It also includes
animal models of cerebral ischaemia or haemorrhage

SURGE: Surgery
Research on all aspects of surgery, including neurosurgery,
orthopaedic and trauma, ENT, cardiac, thoracic, vascular,
gastrointestinal, urological, transplant, plastic, maxillo-facial,
paediatric, gynaecological, ophthalmological and general
surgery. It includes surgical techniques including laparoscopy,
and related research in musculoskeletal tissues, transplantation
and gastrointestinal surgery-related issues and vascular
tissues

TROPM: Tropical medicine
All types of research relevant to health and disease in Africa,
Southern and Central America, and Asia, including studies of the
major infectious diseases, of disease carriers (e.g. insects),
nutritional disorders, snakebites and non-communicable diseases.
The search strategy does not specifically target non-communicable
diseases, but will capture some research in this area through the
use of broad key words linked to health and disease in the tropics
and through retrieving all publications in the major tropical
medicine journals

TROVE: Tropical veterinary medicine
Biomedical research involving animals aimed at the improvement
of animal or human health (except where the animal is being used
purely as a model for biomedical processes, including diseases, in
humans) in Africa, Southern and Central America and Asia,
including studies on infectious diseases, vectors, nutritional
disorders, welfare and health-related aspects of animal
production

VETER: Veterinary medicine
Biomedical research involving animals aimed at the
improvement of animal or human health (except where the
animal is being used purely as a model for biomedical processes,
including diseases, in humans) including studies on infectious
diseases, vectors, nutritional disorders, welfare and health in
domesticated animals (mammals, birds, fish, and invertebrates)
and in wild animals, where they impact on domestic animals or
human health
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Abstract 


Purpose — To judge the quality of health information provided to the users of the NHS Direct online 
enquiry service. 

Design/methodology/approach — An examination of available online tools was necessary to 
enable the development of a quality framework appropriate for the study. The checklist developed 
from this process provided a method of judging a specific web site’s quality level. Readability levels of 
web sites were measured using the Flesch-Kincaid scale. Two case studies were conducted to examine
consistency of responses, and questionnaires were distributed in order to measure user satisfaction.

Findings — Results from the checklist indicated that the majority of health information sent on to 
users of the service was of adequate or excellent quality. The readability levels of information 
promoted by the NHS Direct Online enquiry service are at levels higher than is recommended in the 
literature. The case studies implied that the criteria used by the NHS in composing responses to
enquiries are not always consistent and may need streamlining. Despite this, 97 per cent of respondents
were happy with the information sent to them. A combination of user satisfaction and referral to 
adequate or excellent quality health web sites suggests that the NHS is providing a good quality 
information service to the British public. 

Research limitations/implications — It is difficult to draw reliable conclusions from the small 
sample size employed in this study. It is also unfortunate that the respondents could not be 
interviewed or observed as they submitted their enquiry and while they examined web pages. The 
checklist developed to measure web site quality could, in itself, bring limitations: no weighting factors
were employed when comparing criteria, and the researcher felt that some of the criteria were hard to
judge in practice.

Practical implications — The NHS need to undertake some streamlining of their e-mail enquiry 
service so that all the web sites it promotes contain health information that is at a good or excellent
quality level.

Originality/value — Examination of a practical health service which purports to help improve the 
quality of NHS health provision. 
Keywords National Health Service, Health education, Information management, Quality,
Electronic mail, Internet
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a variety of forms of medical web sites, including online self-help groups and sites set 
up by individuals or organisations. Some of these sites will also offer an enquiry 
service where the public can ask questions of health professionals by e-mail. In 
November 2001 the NHS launched an online enquiry service of their own, which 
enables internet users to submit enquiries on health related issues to which they then 
receive a reply via e-mail. 

One of the major concerns with regard to such web sites is the quality of the 
information that is presented. Health information of a poor quality can easily lead to 
individuals inadvertently causing harm to themselves or their family. The accurate 
assessment of health information available online therefore becomes an important 
issue. Tools have been developed to assess this information but these tools are varied 
in both content and standard. There is no one accepted method to apply to health 
information on the internet. 

This study addressed issues of quality by undertaking an evaluation of the NHS 
Direct online e-mail enquiry service. The service is accessed via the NHS home page 
(www.nhsdirect.nhs.uk/). A link in the left-hand column titled “Send us your enquiry” 
takes you to a disclaimer, which lists a number of terms and conditions applying to the 
site. These include the fact that information can only be given about named conditions, 
and that it is not suitable if you: 


* have symptoms that have not been diagnosed;
* are unsure about what condition you have; and
* feel the treatment you are receiving may not be working.

After clicking the “Accept” button at the bottom of this page you are taken to the
“Health Enquiry Questionnaire”. The form asks for a range of information and lists a
number of questions which the user must answer, such as:

* their e-mail address;
* who the enquiry is about; and
* what information they are looking for and at what level.

After submission a user can expect to receive an e-mail within five working days. The
response will include instructions and a link to another web page, which will give the
user links to a number of health web sites containing information specific to their
needs.


Aims and objectives 
The aim of this study is to judge the quality of the health information provided to 
internet users of the NHS Direct online e-mail enquiry service. 
The specific objectives are listed as follows: 
* To develop an appropriate checklist for judging the quality of health web sites. 
* To use the checklist to judge the information retrieved from the NHS. 
* To measure users’ satisfaction with the information they receive and the enquiry 
service as a whole. 
* To measure readability statistics of the web pages.
* To use case studies to examine the similarity of URLs received by enquirers,
with regard to requests for information on glandular fever, and to examine
differences in requests for basic information against requests for advanced
information.


Literature review 
The literature on health information on the internet covers a number of themes. Those 
discussed here cover quality of health information and health information literacy. 


Quality of health information on the internet 

Many studies have been undertaken to examine the quality of health information on 
the internet with regards to a particular disease. These include one by Impicciatore et al. 
(1997), which looks at the management of fever in children at home, and Abbott (2000) 
who covers web pages relating to the MMR vaccine. All agree that there is a great deal 
of misleading or inaccurate data available on web sites, with the coverage of key 
information being inconsistent and poor. Gilliam et al. (2003) state, “coverage of key 
information for patients is usually poor”. It is also noted that high reading levels are 
required to comprehend this information. 

Many articles examine instruments and criteria for judging quality. Early articles 
on this start by listing general points for consideration when trying to assess the 
quality of information on the internet. Wyatt (1997) examines accuracy of material, 
timeliness, readability and accessibility of information on the web site. In an article by 
Silberg et al. (1997) criteria listed include the effective use of technology including 
feedback, linking mechanisms and web site content. They believe that issues of content 
quality are similar to those for any written information. They talk of a brand identity, 
picked up on later by Delamothe (2000), which establishes a level of trust with readers, 
and they list quality indicators which exist for printed material and should also apply 
in the digital arena — authorship, references, disclosure and currency. 

Kim et al. (1999, p. 649) reviewed the published criteria for evaluating health sites. 
They noted that many authors agree on key criteria and efforts to develop a consensus 
tool would be helpful: “The next step is to identify and assess a clear, simple set of 
consensus criteria that the general public can understand and use”. 

It is evident from later articles that this set of widely accepted core standards did 
not develop. Jadad and Gagliardi (1998) published results of a study to identify 
instruments used to rate health web sites and concluded that most of the instruments 
were underdeveloped. They raised the question of whether they should even exist in 
the first place. In a later study (Gagliardi and Jadad, 2002, p. 572) they performed an 
update to the original. Many tools featured in the original study were no longer 
functioning and none of the remaining six sites appeared to have been validated. Of the 
51 identified after 1998, only five gave information by which they could be evaluated. 
They noted that “several organisations, including government and non-profit entities, 
have developed criteria by which to organise and identify valid health information” yet 
were still uncertain as to whether it is even necessary to assess the quality of health 
information on the web or if it is achievable. 

Although a number of articles look at criteria and rating instruments, few examine 
whether consumers actually use these instruments or what criteria they apply when 
appraising health information. A recent study by Eysenbach and Kohler (2002, p. 576)
describes the techniques used by consumers to search for and judge health information
on the internet. In focus groups and interviews participants said they looked at factors 
such as source, design and ease of use when assessing the credibility of a web site. Yet 
results from observational studies showed otherwise. Subjects could only correctly 
identify which web sites they gained information from or who held themselves 
responsible for these sites 20.9 per cent of the time. In all cases this was not because the 
sites did not disclose such information but because users did not pay attention to this 
information. The authors concluded that further observational studies are required in 
this area as “people in a real setting with a greater stake in the outcome ... might care 
more about quality and therefore more actively look for markers of quality”. 

Discussions on the alternatives to rating instruments have appeared in the literature 
more recently. Eysenbach is particularly vocal with his support of “trustmark” 
techniques to allow for a collaborative approach to evaluation, so different rating 
agencies can use similar standards with a common meta-data language (Eysenbach 
and Diepgen, 1998a). Often, the concept of rating a site is disputed because the content 
frequently changes. Eysenbach points out that this is valid only if the unit of 
evaluation is the information on the server, not if the trustmark refers to the 
information system or provider (Eysenbach et al., 2001). Recommendation should be
based on knowing the process behind the information production. 


Health literacy on the internet 

The reading ability of the average adult falls below the level of educational material, 
forms and documents commonly used in the health field (Rudd et al., 1999). This idea is
echoed in studies on the readability levels of medical material on the internet. The 
American Medical Association Ad Hoc Committee on Health Literacy (1999) 
recognised that almost half the US population has deficiencies in reading skills. An 
OECD report entitled Literacy in the Information Age (OECD, 2000) looked at 20 
nations worldwide and declared that low literacy skills are to be found even among the 
most economically advanced societies. Between a quarter and three-quarters of adults 
failed to attain a literacy level of three, which is the suitable minimum for coping with 
the demands of everyday life and work in a complex society. 

Graber et al. (1999) used the Flesch score and Flesch-Kincaid[1] reading level
formulae to examine the readability of patient education material on the internet. Their 
study showed that the majority of such material is at a higher level than is appropriate 
for most patients. Perez and Couto (2002) also employed Flesch levels (Spanish 
adaptation) when analysing health web sites. They concluded that although 
readability scores for the documents analysed were good, they were not optimum 
for users searching for health related information on the internet. 

What reading level is actually recommended for health information? Eaton and 
Holloway (1980) suggested a grade level between five and seven for paper inserts for 
packets. Determinations of reading levels are only valuable as seen in the light of their 
target audience (Rudd et al., 1999). With a target audience of internet users, Hughes
(2001) takes account of this idea when he recommends a Flesch-Kincaid level of eight,
at the highest. He acknowledges that this is higher than some would like but defends 
his position thus: “One can safely argue that the internet does some screening of 
literacy and readers of internet content are probably more literate than the general 
population”. 
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Table I. 

Percentage scoring 
system of the Rollins 
School of Public Health 


Summary 

The quality of health information on the internet continues to be an issue of importance
to health professionals and consumers. It is agreed that few web sites provide complete 
and accurate health information and therefore the need for a set of core standards for 
the publishing of online health information is vital. Attempts to improve the situation 
are being undertaken by organisations such as MedCERTAIN, and in time literature 
examining the results of such efforts is likely to be published. 


Methodology 
Methods used include: 


* An evaluatory checklist.
* Questionnaires.
* The Flesch-Kincaid readability scale.
* Case studies.


The checklist 

There are numerous evaluation tools available on the internet, which aim to help 
consumers or health workers judge the quality of health information presented on the 
internet. Relevant web sites were identified using the search engine Google. A total of 
20 of these tools were examined, with a view to implementing one in this study yet 
none were found to be useable in their existing form. Therefore, a further examination 
of currently available online tools was necessary to enable the development of a 
checklist appropriate for this study. The finalised checklist then provided a method of 
judging a specific web site’s quality level. 

The scoring of the checklist followed that of the Rollins School of Public Health 
(www.sph.emory.edu/WELLNESS/instrument3.html). In calculating a score using 
their method, “total points” are tallied alongside “total possible points”. “Total possible
points” is defined as the number of questions answered as either “Agree” or
“Disagree”, multiplied by two. The “total points” score is divided by the total possible
points to determine the overall percentage rating of the web site:


Total score/total possible score = “Percentage of total points”. 


The rating is selected as excellent, adequate or poor as shown in Table I. Table II shows
a checklist example. 


Score Rating 


At least 90 per cent of total possible points Excellent: this web site is an excellent source of 
health information. Consumers will be able to easily 
access and understand the information contained in 
this site 

At least 75 per cent of total possible points Adequate: while this web site provides relevant 
information and can be navigated without much 
trouble, it might not be the best site available 

Less than 75 per cent of total possible points Poor: validity and reliability of the information 
cannot be confirmed 
Table II. Scoring example

The author is clearly stated or clearly inferred                 2
References/sources and citations are given                       1
There is disclosure about the source of funding                  1
Advertisements are clearly stated as such                        0
Contact information is given for the author or webmaster         2
Accountability for maintenance of the site is clearly given      2
The editorial policy for material inclusion is given             2
Totals                                                           2 8
Total score = 10
Total possible points = 2 x 6 = 12
Percentage of total points = 10/12 = 83 per cent
Rating corresponds to poor/adequate/excellent
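
The scoring arithmetic described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Rollins instrument itself: the function name is hypothetical, and the coding of answers (2 for “Agree”, 1 for “Disagree”, 0 for a criterion that does not apply and is therefore left out of the total possible points) is an assumption inferred from the worked example, in which seven criteria yield a total possible score of 2 x 6 = 12.

```python
def checklist_rating(scores):
    """Return (total, possible, percentage, rating) for one web site.

    Assumed coding (inferred from the worked example, not stated outright):
    2 = "Agree", 1 = "Disagree", 0 = not applicable.
    """
    # Only criteria answered "Agree" or "Disagree" count towards the maximum.
    answered = [s for s in scores if s in (1, 2)]
    if not answered:
        raise ValueError("no criteria were answered")
    total = sum(answered)
    possible = 2 * len(answered)
    percentage = 100 * total / possible
    # Thresholds follow Table I.
    if percentage >= 90:
        rating = "excellent"
    elif percentage >= 75:
        rating = "adequate"
    else:
        rating = "poor"
    return total, possible, percentage, rating


# Scores from the worked example (one criterion scored 0).
example = [2, 1, 1, 0, 2, 2, 2]
total, possible, percentage, rating = checklist_rating(example)
print(total, possible, round(percentage), rating)  # 10 12 83 adequate
```

Under these assumptions the example site scores 10 out of 12, or 83 per cent, which falls in the “adequate” band.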





In order to establish the readability level, a piece of the text from the web page in 
question was copied into a blank Microsoft Word document. The “Tools” menu in 
Word gives a “Spelling and Grammar” option, which was adjusted to show readability 
statistics and run against each section of text. If the result was level eight or below, 
then a point of two was ascribed to the page for this criterion. A level of nine or above 
gave a point of one. 
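
For readers without access to Word, the Flesch-Kincaid grade level can be approximated directly from its published formula: 0.39 x (words per sentence) + 11.8 x (syllables per word) - 15.59. The sketch below is an approximation under stated assumptions and is not the procedure used in this study: the syllable counter is a crude vowel-group heuristic, the function names are hypothetical, and results will differ somewhat from those reported by Word.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: count groups of consecutive vowels.

    This heuristic is an assumption made for illustration only.
    """
    lower = word.lower()
    count = len(re.findall(r"[aeiouy]+", lower))
    # Treat a trailing silent "e" as non-syllabic (but keep "-le" endings).
    if lower.endswith("e") and not lower.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level computed from the standard formula."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def readability_points(level):
    """Checklist points for a reported whole-number level, as described above."""
    return 2 if level <= 8 else 1


print(round(flesch_kincaid_grade("The cat sat on the mat."), 2))
print(readability_points(8), readability_points(9))
```

How a fractional grade should be mapped to a whole-number level is not specified in the text, so `readability_points` takes the level as already reported.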


The questionnaire 

Questionnaires were designed to gauge the level of user satisfaction with the quality of 
the information sent by the NHS and satisfaction with the service overall. A sample of
internet users were given a sheet of instructions to follow and asked to send a request 
to the NHS on a health topic of their choice. The respondent completed the first 
questionnaire after sending their enquiry to the NHS and the second questionnaire 
after the information was received and read through. Returned questionnaires were 
coded onto a spreadsheet and analysed. The responses sent by the NHS were copied 
and returned along with the questionnaires, then examined using the checklist to 
determine a quality level of poor, adequate or excellent. 

The main areas covered by the questionnaires were as follows: 


* Basic personal details and internet usage.
* Technical experience in submitting an enquiry.
* Response time and retrieval of information from the internet.
* Comprehension and perceived quality of the information retrieved.
* Presence of health associations online and issues relevant to health information
on the internet.


Questions in the second questionnaire were designed to cover a number of criteria 
essential to academic considerations of quality of online health information, such as 
usability, relevance, bias, readability, authorship, currency and funding. This was 
done in order to determine if consumers took note of such issues while examining 
health information online. 







Readability levels 

The Flesch-Kincaid scale is used within the checklist to score the readability level of 
information presented to users. This has been used in other academic studies of a 
similar nature (Graber et al., 1999; Perez and Couto, 2002).


Case studies 

In order to test the consistency of information presented to users, smaller case studies were 
conducted which used content analysis of the NHS replies. A number of users were asked 
to submit a request for information on glandular fever. The Uniform Resource Locators 
(URLs) sent in response were examined in an attempt to determine to what extent the NHS 
are sending links to the same information sources, in response to similar queries. 

A second case study was designed to examine if the NHS were consistent in 
matching requests for basic or advanced information with the appropriate response. 
Readability levels were measured and noted against the original request for basic or 
advanced information. Pages sent in response to the basic requests should have a 
readability level of eight or below on the Flesch-Kincaid scale. 


Sample unit 
Convenience sampling was used in this study, where respondents were recruited from 
advertisements on the JISC Health Services Research mailbase and at City University. 
Snowball recruitment was employed where possible (here an initial group of respondents 
are asked to bring others fitting the respondent criteria into the sample unit). 

There were only three essential inclusion criteria. These are the same as those that 
enable the consumer to use the service: 


(1) Being a UK resident. 
(2) Having internet access. 
(3) Having an e-mail account. 


Results 

Existing quality assessment tools 

The existing tools were examined to determine the most accepted criteria for the 
determination of a quality web site. A number of the web sites, including British 
Healthcare Internet Association (BHIA), the Centre for Health Info Quality and QUICK 
(Quality Information Checklist), give a summary of the basic issues involved and 
present the user with a list of issues (usually in point form) to consider in determining 
quality and reliability of a health web site. Figure 1 shows the criteria most commonly 
used by the web sites examined. It was from these criteria that the checklist used in 
this study was developed. 


Checklist results 
The following factors were considered when each web site was marked against the 
checklist: 
* Web page or web site? A user should not have to examine the entire site for the
information they are looking for. Concentration was focused on the specific URL
sent but two clicks away from the page were permitted.
* If the site was accredited by a well-known organisation, such as CHIQ or HON, it
received a “2” for the “Editorial policy/code of ethics” criterion, even if the code
was not visible on the page given.


Quality ratings. A total of 106 web sites were examined and scored with the final scores 
ranging from 58 to 97. The average was 86, with a standard deviation of 7.6. This gives 
an average quality grading of “adequate” to the web pages distributed by the NHS. 

Readability scores. Of the 106 pages, 98 were graded. For technical reasons, it was
not possible to score the other pages. Readability scores ranged from five to 12.
Table III shows the frequency of grades[2].

A total of 77 (78 per cent) of the scores were above the recommended maximum level of
eight. However, only four (13 per cent) respondents in this study indicated that the
material was too “technical” in their final comments. Some 11 (35 per cent) had
difficulty understanding some of the information sent to them with the remaining 20
(65 per cent) having no difficulty understanding the information.
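
The share of pages above the recommended level can be checked against the score distribution with a short calculation. The sketch below is illustrative only; it takes the count for level eight as 11, the value at which the counts sum to the 98 graded pages, and the variable names are hypothetical.

```python
# Readability level -> number of pages, as given in the score distribution.
# The count of 11 for level eight is an assumption (see the note above).
distribution = {12: 27, 11: 16, 10: 18, 9: 16, 8: 11, 7: 5, 6: 4, 5: 1}

total = sum(distribution.values())
above = sum(n for level, n in distribution.items() if level > 8)
share = 100 * above / total

print(total, above, f"{share:.1f}")  # 98 77 78.6
```

This gives 77 of 98 pages, or about 78.6 per cent, in line with the figure of roughly 78 per cent reported above.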


The questionnaires 
Response rate. In total, 37 out of 50 questionnaires were returned. Of these, six (16 per
cent) respondents did not receive an answer from the NHS after submitting their
enquiry. All the following figures in this study are calculated from the 31 respondents
who did receive an answer, and whose web sites could therefore be examined.

Figure 1. Quality criteria sorted by most common mention

Table III. Readability score distribution

Score Quantity
12 27
11 16
10 18
9 16
8 11
7 5
6 4
5 1

Personal details and internet usage. Responses came from 19 females and 12 males.
A total of 21 (68 per cent) were aged from 21 to 35, eight (26 per cent) were aged
between 36 and 50 and two were aged 65+.

Of the respondents, 28 (90 per cent) used the internet on a daily basis. The
remaining three (10 per cent) used it a couple of times a week. A total of 23 (74 per cent)
respondents had searched for health information prior to this study, while only 12 (39
per cent) had heard of the NHS Direct online web site before. Of these 12, only three had
used the e-mail service previously.

Technical submission of enquiry. This is the process of sending the enquiry via the 
internet. Technical problems encountered could have included, for example, broken 
links or the appearance of error messages. Of the respondents, 29 (94 per cent) gave 
comments regarding the submission of their enquiry. 

A total of 12 (39 per cent) found the submission process straightforward while seven 
(23 per cent) encountered technical problems. One respondent stated: 


When submitting I was told the question could not be answered because it was about 
medication, but I ticked the box saying it was NOT about medication. 


This respondent also had another problem: 


I was told to rewrite the query twice and all checkboxes went blank so I had to fill in again. 
Got it right at the fourth attempt. 


Other respondents had similar issues and it is unlikely that they would have continued 
submission of their enquiry except for the fact that they were partaking in a formal 
study. One respondent, who had previously used the e-mail service, stated: 


I found it straightforward [technical submission of the enquiry], but the last time I used it the 
response was not helpful. 


The respondent was not happy with the response received on this occasion either and 
felt the service to be “very disappointing”. Although they were not happy with the 
information sent to them, they marked that they were only “not sure” as to whether 
they would use the service again or recommend it to others. 

Design of the form was also an issue for some respondents: 


I am not sure that the questions asked would elicit all potential data needed to answer some 
enquiries. 


With regard to online help offered on the site, the following comments were made:


Fairly simple process but the help box was not very helpful.


The help links, should you require them, were merely repetitions of the instructions on the 
form. 


Response time, usability and information retrieval. It took the NHS from less than 24 
hours to seven days to answer the enquiries. The average time taken was 2.5 days, 
with 21 (68 per cent) responses received within two days. The NHS aim to reply within 
five working days from receipt. If the weekend is taken into consideration then the 
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NHS has achieved this goal. (It must be noted that 16 per cent of the original number of 
respondents never received an answer.) 

Of the respondents, 25 (81 per cent) easily found the details they wanted on the web 
pages looked at, four (13 per cent) felt they were somewhat easy to find and two (6 per 
cent) had difficulties. Only two (6 per cent) respondents had technical difficulties 
accessing the web pages. It would be expected that the links sent to respondents were 
currently active yet it is impossible for this to be guaranteed due to the fluid nature of 
the internet. 

Comprehension and perceived information quality. While 24 (77 per cent)
respondents found the information relevant to their needs, only 19 (61 per cent) said
the information answered their entire query. A total of five (16 per cent) found the
information “somewhat” relevant and 11 (35 per cent) found that the information
“somewhat” answered their query. Two (6 per cent) found the information irrelevant
and one (3 per cent) felt it did not answer their query.

Almost half of the sample (49 per cent) did not check the authorship of the 
information they looked at. Currency seems to be more of a concern to users. Only 
seven (23 per cent) respondents did not check for a date. A total of 14 (45 per cent) noted 
that the information was up to date and nine (29 per cent) were not sure. Only one 
respondent felt they did not have current information (and yet they indicated they were 
happy with the information received and would use the service again). 

The majority of respondents answered the question “Do you have any comments 
about the NHS Direct online enquiry service web site or the information that was sent 
to you?” There were a number of straightforward comments along with those that were
qualified by another factor: 


This is brilliant. I feel confident with the information given. 


It worked very well and the web sites recommended to me had good and comprehensive 
information but I have to say that my enquiry was probably not particularly difficult to 
answer. 


Other positive comments offered suggestions for improvement: 


Although the advice came from different medically orientated organisations, the articles did 
not promote use of specific equipment or medical centres as a follow-up. 


A summary of key points in the mail sent to me rather than just the URLs would have been 
useful. 


Most respondents who indicated disappointment with an aspect of the system had an 
overall positive response to the NHS Direct online enquiry service. Only two comments 
were blatantly negative and both shared a disappointment with the quality of the 
information received: 


Very disappointing. Sent me very general information, which did not answer my query at all. 


I was generally rather disappointed with the response I received. 


Content. Some respondents found the information sent to them to be too advanced 
while others found it too basic. Also, confusion existed in reference to the nature of the 
information available: 
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When I requested the information I thought I would receive a personalised response and not 
just links to web sites. 


Still confused about basic and advanced information. The help menu should be more specific 
as to what the difference is. 


One respondent gave a comment indicating that they did not understand the nature of 
the service, even though they stated they had read the terms and conditions: 


I did find it fairly restrictive that you could only enquire about pre-diagnosed conditions, I 
thought you would be able to give symptoms and ask for an official diagnosis. 


Case studies 

Between August 2002 and March 2003 five respondents requested information about 
the condition of glandular fever. One did not receive a reply. A total of 18 different 
URLs were sent in response to the remaining enquiries. Of these, six (33 per cent) 
overlapped. This shows a relatively low level of consistency in response to queries of a 
similar nature. 
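
The overlap measure used here can be made explicit with a short sketch. The code below is illustrative only: the function name and the example URLs are hypothetical, and it assumes that a URL “overlaps” when it appears in the replies to more than one enquirer, which is one plausible reading of the figures above.

```python
from collections import Counter


def url_overlap(replies):
    """Return (distinct URLs, URLs sent to more than one enquirer, percentage)."""
    # Count each URL at most once per reply so that repeats within a single
    # reply do not register as overlap.
    counts = Counter(url for reply in replies for url in set(reply))
    distinct = len(counts)
    shared = sum(1 for c in counts.values() if c > 1)
    percentage = 100 * shared / distinct if distinct else 0.0
    return distinct, shared, percentage


# Hypothetical replies to three enquirers asking about the same condition.
replies = [
    ["http://example.org/a", "http://example.org/b", "http://example.org/c"],
    ["http://example.org/b", "http://example.org/c", "http://example.org/d"],
    ["http://example.org/c", "http://example.org/e"],
]
print(url_overlap(replies))  # (5, 2, 40.0)
```

Applied to the case study, 18 distinct URLs of which six were shared gives the 33 per cent reported above.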

Of the respondents, seven (23 per cent) requested basic information, four (13 per 
cent) requested advanced and 20 (64 per cent) requested both. All URLs sent in 
response to the advanced queries were at an advanced readability level of nine or 
above. A total of 14 (58 per cent) of the URLs sent in response to the basic queries were 
at a level of nine or above, while only ten (42 per cent) were at eight or below. 

From these URLs it was possible to compare one basic against one advanced 
response. Each received five URLs and three (60 per cent) of these were identical. These 
three all had a readability level above eight. Only one web page from the basic 
response had a score at eight or below. 


Discussion 

Limitations of the study 

It is difficult to draw reliable conclusions from the small sample size employed in this 
study. It is also unfortunate that the respondents could not be interviewed or observed 
as they submitted their enquiry and while they examined web pages. It is important to 
pay attention to what users do as well as what they say. Studies have indicated that 
although users do one thing, their recollection of that event may differ from what really 
happened (Nielsen, 2001). Therefore problems experienced by users may be forgotten 
or misrepresented when completing the questionnaires. 

The checklist developed to measure web site quality could in itself bring limitations. 
For example, differing definitions of authorship and currency affect the results from 
the checklist along with respondent responses on questionnaires. With both these 
criteria a distinct definition may be necessary to clarify what is being measured. Does 
the date refer to when the web site was last updated or when the material itself was 
written? Is the “author” an individual or an institution? Does confidence in the 
information presented increase because the author is an established organisation or 
because the individual has the title of “Dr” preceding their name? Such issues can 
directly affect the perceived quality of information and yet have not been accounted for 
in the checklist developed or in others examined during this study. 

No weighting factors were employed when comparing criteria and the researcher 
felt that some of the criteria were hard to judge in practice. For example, medical bias is
difficult for a layperson to judge. Without having conducted background research how
can a member of the public judge if the information presented covers all options and 
opinions? Again, the examination of other available online quality rating tools 
indicated that the problem is not limited to this study alone. 


Key findings 

Quality. Only eight of the 106 web pages examined were of a poor quality, with the 
remaining (92 per cent) being adequate or excellent. This is a good result for the NHS, 
indicating that users can be satisfied that the information they are receiving is likely to 
be some of the best that is available online. Yet the NHS must aim to achieve a 100 per 
cent adequate or excellent rating on the web sites they distribute, as the information 
contained on those pages can have a direct effect on the health of their enquirers. It is 
worrying that 33 per cent of respondents would use the information to make a decision 
about their own or their families’ health yet these same respondents were not sure if 
they would discuss this with their GP. Health information from any source should be 
qualified by a professional before being acted on.

Web site usability directly affects the quality of a web site. Information of excellent 
quality is useless if users cannot access it. In all, 97 (92 per cent) of web pages examined 
received full points under the checks for usability, 25 (81 per cent) respondents easily 
found the details they wanted on the web pages looked at and only two (6 per cent) 
respondents had technical difficulties accessing the web pages. It is difficult to make a 
definite judgement without knowledge of the respondents’ details of access to the 
internet but, even so, such results are encouraging and indicate that the NHS is 
promoting information that is easily accessible. 


Suggestions for improvement 

A number of points resulting from this study can be examined in order to improve the 
service. One respondent made a comment referring to the restrictive form of the 
service. The NHS Direct online web site gives a clear indication that they are restricted 
to providing information on diagnosed conditions only, echoing Eysenbach and 
Diepgen’s (1998b) belief that cyberdocs should limit their advice to general health 
queries. Perhaps they could also provide a detailed explanation as to what the service 
does not do and why. 

Although some respondents asked for basic and advanced information they did not
appear to realise that the advanced information could be too technical for them and 
complained of its difficulty. Again, the NHS needs to be clearer about certain aspects of 
the service being provided. A detailed explanation of the exact nature of the service 
should be written in very clear and simple terms and made easily accessible within the 
web site, in order to prevent misunderstanding. 

In response to the enquiries on glandular fever, 18 different URLs were sent, and of 
these, only one-third of the responses were the same. This shows a relatively low level 
of consistency in response to queries of a similar nature. Perhaps the criteria used by
the NHS in composing responses need to be streamlined or procedures put in place to
ensure a higher level of consistency. Consistency was also an issue with regards to site
security. On some occasions the researcher was able to access a respondent’s answer
by using a link provided by the respondent, via e-mail. On other occasions this method
of accessing the answer did not work. Further study would be needed to indicate why
these issues of inconsistency occurred. 

A number of respondents experienced technical problems when submitting their 
enquiry and it is likely that some continued only because they were part of a formal 
study. Submission problems may have stemmed from site design and further study 
could indicate the origin of such problems. 

Of the respondents, 16 per cent of the original number never received an answer 
from the NHS after submitting their enquiry. The NHS must give a higher response 
rate than the 84 per cent from this study if it is to provide a truly quality service. 

One respondent made a comment on the quality of their response stating: 


I might have got just the same sort of level of information typing my inquiry into a search
engine.


The NHS needs to ensure that its service is not simply a duplication of the search
engines that currently exist on the internet. The adequate to excellent quality of the 
majority of sites examined would indicate that the NHS service does act as a filter to 
good quality health information. 


Readability 

Of the respondents, 65 per cent had no difficulty understanding the information sent to 
them, despite 78 per cent of the scores being above the recommended level of eight or 
below. 

From the URLs sent in response to queries for glandular fever, it was noted that 
readability levels were not significantly different for basic or advanced responses. This 
suggests that the NHS must streamline their responses to match all the specifics of the 
queries put to them. 


Conclusion 

One specific aim of the NHS Direct is to improve the quality of its health provision 
(NHS Direct, 2001). Results from the checklist developed for this study indicated that 
the majority of health information sent on to users of the NHS Direct online e-mail 
enquiry service is of adequate or excellent quality. Even so, the NHS must try to 
improve this so that 100 per cent of the web sites it promotes contain health 
information that is at a good or excellent quality level. 

The enquiry service works within the confines of what already exists on the internet 
and responses to users will reflect this. The readability levels of information promoted 
by the NHS service are at levels higher than is recommended in the literature yet this is 
a feature common to most health sites on the internet. The NHS can do little about this, 
but they can ensure that the specificity of sites sent to users matches the queries they are
sent out in response to. Only two of the respondents had difficulty in finding
information on the web pages sent to them, which indicates that specificity at this level
exists. Yet results from this study show that there is some question as to whether sites
are correlated against the users’ choice to receive basic or advanced information. If the
NHS is to follow recommendations by others working in the field of online health
information, then those asking for information at a basic level must receive information
graded at or below eight on the Flesch-Kincaid scale, so that the potential for 
misunderstanding health information is kept to a minimum. 
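
For reference, the Flesch-Kincaid grade is computed from average sentence length and average syllables per word. The sketch below is a minimal illustration, not the tool used in the study. The syllable counter is a crude vowel-group heuristic, the function names are hypothetical, and dedicated readability software will give somewhat different scores. The final helper drops the decimal part of a score, as the study's notes describe.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: count groups of consecutive vowels."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing "e" as silent (e.g. "made"), but keep "-le" endings ("table").
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level (US school grade) of a passage of English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def truncate_grade(score):
    """Ignore figures after the decimal point, so 8.9 counts as grade 8."""
    return int(score)


def meets_basic_level(text, limit=8):
    """True when the truncated grade is at or below the recommended limit."""
    return truncate_grade(flesch_kincaid_grade(text)) <= limit
```

A check such as `meets_basic_level(page_text)` could be run on the text of each page before it is sent in reply to a basic-level enquiry.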





Of the respondents, 97 per cent were happy with the information sent to them. This 
indicates that the NHS Direct online enquiry service is satisfying the information needs 
of the majority of its users. A combination of user satisfaction with referral to adequate 
or excellent quality health information suggests that the NHS is providing a good 
quality information service to the English public. Hopefully, in time, this service can be 
extended to cover the whole of the UK so that the NHS will be able to achieve its aim of 
improving the quality of health provision to all its users. 


Notes 


1. This system rates text on a US grade school level. For example, a score of 8.0 indicates that 
an eighth grade student can understand the document. 


2. Figures after the decimal were ignored. For example, a grading of 8.9 would be considered to
be within a readability grade of 8.


Abstract

Purpose — To help determine the extent to which “unique” informational content is available on 
personal home pages (PHP) on the world wide web (WWW). 

Design/methodology/approach — The informational content of PHPs is manually compared with 
the informational content of non-PHPs in the field of professional football in England. This produces 
instances of information which is available on the PHPs but not on the non-PHPs. A search is then
carried out to determine whether these pieces of information are available elsewhere on the web. 


Findings — There are notable quantities of information which are only available on PHPs. There are 
also instances where certain information will be available on PHPs before it is available on non-PHPs. 
In addition, the degree to which information on PHPs is correct is also likely to be quite high. These 
facts in conjunction suggest that PHPs as a whole make a notable contribution to the informational 
content of the WWW. 

Research limitations/implications — The sample data are limited in size and scope. 

Practical implications — PHP visibility and utilization may increase.
Originality/value — Provides a methodology for informational comparisons of web pages. 
Keywords Worldwide web, Electronic media, Football 


Paper type Conceptual paper 


Introduction 
Since its relatively recent advent, the world wide web has already changed in many 
surprising ways. In the beginning, it was primarily a means of making information 
easily available and accessible. Since then, it has developed many other uses, as a real 
time communication medium (video-conferencing on the web), as a sales point 
(e-commerce) and even as the world’s largest electronic playground (online gaming and 
gambling). However, the web’s first role has helped it achieve its status as one of the 
world’s largest information resources. It is in this role that we wish to investigate it,
evaluating the personal home page (PHP) as a possibly underrated, but thoroughly
useful, information resource.

It has been said that one way of identifying whether a web page is a PHP is simply by
determining whether it has been created in bad taste, with lots of colour clashes, a
self-promoting style of writing (often entirely in the first person), with numerous
grammatical mistakes, an unusually long URL, dead links, poor graphics and an
amateur photograph (Iron, 2003). The authors believe that this view is shared by many
academics and information scientists.

Aslib Proceedings: New Information Perspectives
© Emerald Group Publishing Limited
DOI 10.1108/00012530510579075

Even though the PHP made its appearance very early on in the development of the 
world wide web — as early as 1993 (Koch, 1993), it has not caught the eye of the 
research community quite as much as other aspects of the web. The negative 
perceptions described above could very well have played an active role in discouraging 
researchers from looking into the PHP in more detail. Whatever the reason, the authors 
believe that PHPs deserve to be the subject of robust research. 


Aims and objectives 

This project has two aims. First, to test the hypothesis that PHPs make a notable 
contribution to the overall information content of the WWW by providing instances of 
“original information”[1]. Second, to provide data which will facilitate a better
understanding of what PHPs on the web have to offer in terms of information vis-a-vis 
more traditional information sources on the web[2]. The research is exploratory in 
nature, with the intention to provide a basis for a larger scale project that will 
(eventually) be able to determine with more certainty what the informational benefits 
of PHPs truly are, and how they can be best utilised to improve the accessibility of 
information for academics, businesses and personal users alike. 


The scope 

The paper in particular covers the information content of PHPs, and compares it with 
that of non-PHP web sites. As such this is primarily a web-based work where the 
majority of all the information examined emanates from the WWW. 

A case study approach is adopted with the chosen subject area for the research 
being the field of professional football in England. The reason for the selection is that 
the field generates a good deal of public interest and data and possesses a number of 
information communities and a range of information players. In some ways this field 
can be considered to be “saturated” in information terms. There are several 
communication mediums which give constant daily coverage of the current state of 
affairs. These include newspapers (The Guardian, The Sun), football magazines 
(FourFourTwo, Shoot Monthly), television (Sky Sports 1, dedicated sports news 
channels like Sky Sports News), web sites (Soccer.net, www.soccer.net,
Football365.com, www.football365.co.uk) and radio (Radio5Live and dedicated sports
stations like TalkSport). 

On the internet, there are 2,197 web sites listed in the professional football 
categories of the Open Directory web site, while Yahoo! has 746. On television in the 
UK, there are a total of 13 channels[3] which have shown football matches from this 
field. In addition there are two (Sky Sports News, Eurosport News) dedicated sports 
news channels, one of which (Sky Sports News) averages approximately 18 hours of 
football news per day[4]. On the radio in the UK, there are on average, 30 programmes 
per week about football[5]. Finally, there are at least three football magazines available 
in ordinary newsagents in London[6].

Further evidence of the popularity of professional football in England is provided 
by the “financial attention” it receives. The Premiership is arguably the “richest” 
football league in the world, accounting for 25 per cent of the European football 
industry (Harding, 2003). In the list of the 20 richest (in terms of annual turnover)
football clubs in the world (2002/2003 season), seven are in the Premiership, with five of
those in the top ten (Roberts et al., 2001).

Meanwhile, in terms of getting a variety of “informational environments”, the 
chosen field has four divisions, each with a varying degree of public interest and 
attention. At the one end is the top division of professional football in England (the 
Premiership), where the volume of information is at its peak and at the other end of the 
spectrum is Division 3 (now also known as the Coca-Cola Football League 2), which 
receives noticeably less coverage. More figures which can be used as a gauge to 
indicate the level of public interest in the leagues, as well as some informational 
coverage (Stat-mail, 2004), can be found in Table I.


Methodologies 

This is a project in which the emphasis is on the informational content of PHPs. As far 
as we can determine at the time of writing, such a project has never before been carried 
out. As such, there are no established methodologies available for carrying out this 
research. However, there have been a number of projects which have focused on other
aspects of PHPs. Therefore, in deciding an appropriate methodology for this project,
the methodologies used in PHP-related projects have been studied. These will be
looked at presently. 

Table I. Gauges of public interest in English professional football leagues for the
2003/2004 season (television coverage, football match attendance and internet web sites
from the Open Directory and Yahoo!)

League        Live televised   Average         Web sites    Web sites
              matches(a)       attendance(b)   at Dmoz(c)   at Yahoo!(d)
Premiership   106              34,900          906          414
Division 1    50               15,908          467          332
Division 2    10               7,486           437
Division 3                     5,389           387

Notes: (a) See, for example, SkySports.com (2004); (b) See, for example, Stat-mail (2004);
(c) Premiership figures at: DMOZ Sports > Soccer > UEFA > England > FA Premiership;
Division 1 figures at: DMOZ Sports > Soccer > UEFA > England > Football League >
Division 1; Division 2 figures at: DMOZ Sports > Soccer > UEFA > England > Football
League > Division 2; Division 3 figures at: DMOZ Sports > Soccer > UEFA > England >
Football League > Division 3; All figures checked on 13 May 2004; (d) Premiership figures
at: YAHOO! Directory > Regional > Countries > United Kingdom > Recreation and Sport >
Sport > Football > Leagues > Premiership; Remaining division figures at: YAHOO!
Directory > Regional > Countries > United Kingdom > Recreation and Sport > Sport >
Football > Leagues > Nationwide Leagues; All figures checked on 27 April 2004

Dominick (1999) examined the PHP to find out what web authors were doing with
the opportunity given to them to become mass communicators. PHPs were looked at in
order to identify their most popular features, but also in terms of self-presentation. A
total of 319 personal home pages were examined. The page collection was carried out
by using the Yahoo! web site, where there was a sub-category entitled “People” within
“Entertainment” (this sub-category no longer exists). The “People” sub-category had
links to PHPs, and Dominick selected 500 of them by picking a random starting point
and deciding a random “skip”. The sub-category had been indexed by letter, and this
process was carried out for each letter (i.e. random skip for PHPs beginning with A,
and another random skip for PHPs beginning with B, etc). If a site was non-English, it
was dropped from the sample and the next one on the list was selected. Commercial
and professional sites were excluded from the sample later on, thereby reducing the
size of the sample from the original 500 to 319. The actual examining of PHPs was
carried out by 16 undergraduate students.
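
The random-start, random-skip selection described for Dominick (1999) can be sketched as follows. This is a minimal illustration rather than Dominick's actual procedure. The function name, the `per_letter` quota and the upper bound on the skip are assumptions, since the source gives neither the number of pages taken per letter nor the range of the skip. The replacement of non-English pages by the next entry on the list is not modelled.

```python
import random
from collections import defaultdict


def sample_by_letter(page_titles, per_letter, seed=None):
    """Pick pages from an alphabetical listing, letter by letter.

    For each initial letter a random starting point and a random skip are
    drawn, and up to per_letter titles are taken at that interval.
    """
    if per_letter < 1:
        raise ValueError("per_letter must be at least 1")
    rng = random.Random(seed)

    # Group the listing by initial letter, as the directory was indexed by letter.
    by_letter = defaultdict(list)
    for title in sorted(t for t in page_titles if t):
        by_letter[title[0].upper()].append(title)

    sample = []
    for letter in sorted(by_letter):
        listing = by_letter[letter]
        n = len(listing)
        take = min(per_letter, n)
        # Assumed bound: keep the skip small enough that no page is picked twice.
        max_skip = max(1, n // take)
        start = rng.randrange(n)
        skip = rng.randint(1, max_skip)
        # Wrap around the end of the letter's listing if necessary.
        sample.extend(listing[(start + k * skip) % n] for k in range(take))
    return sample
```

Drawing a fresh start and skip for every letter mirrors the description above, in which one random skip was used for PHPs beginning with A and another for those beginning with B.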

Similarly, Papacharissi (2002) carried out a study to understand the utility of PHPs 
for their creators. The study produced results concerning PHP motives, predictors of 
web page characteristics, authors’ unwillingness to communicate and contextual age. 
The methodology had several similarities to Dominick’s. In Papacharissi’s case, a 
thousand PHPs were randomly sampled from Yahoo! Geocities, AOL, MSN and 
Earthlink (250 from each). Sites which contained primarily commercial or professional 
content were not included in the sample. Papacharissi then e-mailed a questionnaire to 
all the authors and collected demographic data from those who replied, as well as 
information on why the web pages were created and maintained. 

Dominick’s (1999) and Papacharissi’s (2002) sampling method is used in this project’s
methodology, in part, for selecting PHPs. Meanwhile, other relevant but unsuitable
sampling methods for this project were those adopted by Nomura et al. (2001), who
carried out a study on PHPs by taking them directly from university sites (i.e. by location
rather than topic) and Döring (2002) who mentions places to look for PHPs, but once
again not by topic.

Dillon and Gushrowski (2000) also carried out a study on PHPs. In an attempt to 
describe the first truly unique digital genre, they selected 100 random pages from the 
PeoplePlace and Personal Pages Worldwide, and proceeded to identify the common 
elements of these sites. Elements such as counters, titles and graphics were noted for 
their frequency. Having found the most common and least common elements, they 
created two categories with four types of PHP each. The first category of PHP was the 
“common elements” category. PHPs in this category would have two, three, four or five 
of the most common elements. The second category was the “uncommon elements” 
category. This consisted of PHPs with two, three, four or five of the least common
elements. Sets of these sites were then shown to 57 students who ranked them from 1 to 
8 in order, based on PHP design. 

Dillon and Gushrowski’s (2000) methodology is closer to the needs of this study, as 
it could be said that the ranking constitutes a type of comparison. However, the aspects 
being examined by Dillon and Gushrowski are mostly aesthetic, and though it would 
be possible to alter the methodology to include other aspects of PHPs, there is no need 
to rank web pages for this project. 

In addition to the methodologies already looked at, there are other methodologies 
used for web page evaluation which come from the field of Human-Computer
Interaction (HCI). Particularly well-known are the heuristic evaluation methods of
Nielsen (1994). There are ten “usability heuristics” against which a web page or a web
site can be tested in order to obtain feedback about its usability
(Instone, 2002). By testing several web pages against the same criteria, it then becomes
possible to make comparisons between the web sites. 

However, once again, this is effectively a case of examining the interface rather than 
the content and is therefore not suitable for the purposes of this study, where the focus 
is on the content. 


In terms of content, however, there are established evaluation criteria for 
information resources on the web (Smith, 1997; Oliver et al., 1997). With the rapid
growth of the internet in recent years, the quality of information has been a major
concern of librarians especially. In consequence, studies have been carried out to find
suitable criteria for the evaluation of internet resources. Smith (1997) carried out a
literature review in the field of internet resource evaluation and then amalgamated the
criteria found into a “toolbox of criteria”. The criteria in general are specific to the
needs of librarians evaluating resources such as online journals, conference papers and
so on. As such, they are not particularly relevant as a whole to the study presented
here. Smith (1997) himself concludes by saying that those working with internet
information resources should create their own list of criteria according to their needs,
and use those. Accordingly, this approach has been adopted here (see employed
methodology).


Employed methodology 

The methodology adopted for this study is a four-step process. The first step is finding 
web sites or web pages about professional football teams in England, ranging from the 
English Barclaycard Premiership to the Nationwide Division 3 (now known as the 
Coca-Cola Football League 2). All the teams from each division constitute the sample 
population (a total of 92 teams). 


Employed methodology: step by step
(1) Find PHPs and non-PHPs on QPR.
(2) Check availability of all information on PHPs against availability of
information on non-PHPs.
(3) Check availability of information on rest of web.
(4) Check correctness of information found on PHPs against other sources.

Suitable web sites for study were then identified. The criteria for a PHP to be eligible
for selection are listed in Table II. The actual identification of the PHPs was made
using the definition established by de Saint-Georges (1997). As such, the PHP has to:


* claim that it is a PHP; 
* have personal information on it (such as a CV, photograph etc.); or 
* represent a person rather than a group[7]. 


A suitable search phrase (and/or combination of phrases) was run on Google, and the
first 300 entries were checked manually for PHPs. The figure of 300 is a balance
between that which is feasible (i.e. within temporal constraints), and that which would
be ideal. The sample was then added to by examining every link found on the already
discovered web pages of the team in question. They were then checked both for PHPs
and non-PHPs (non-PHP sites found had to be dedicated solely to the team in question).
This produced several lists of sites, including a list of PHPs about a football club, and
the other lists of non-PHPs for the same club.

Table II. Criteria for PHP selection (custom-made using Smith’s (1997) toolbox)

Main criterion   Description
Scope            Must be about professional football in England
Content          Web page must be PHP
                 Web page must have at least a subsection on a specific football club
Workability      Web page must be accessible at time of study
                 The relevant sections of the web page must be written in English
Cost             Web page must have no financial costs for its use

Second, three PHPs representing each club are randomly selected and checked for 
information. The same information is then compared to the information contained in 
the appropriate non-PHP sites representing the same club. The non-PHP sites include a 
comprehensive sport/football web site (e.g. www.bbc.co.uk/football), the home page of 
the football team itself (e.g. www.qpr.co.uk), and one other non-PHP home page,
dedicated to the football team (e.g. www.qprnet.com).

Third, all the information found to be “unique” is checked for on the rest of the web
to ensure that it actually is unique. This is carried out by using terms the researcher
believes to be relevant on a search engine (i.e. Google) and a meta-search engine (i.e.
Dogpile).

Finally, a sample of non-unique information is taken from the PHPs (about 10 per 
cent of the total number of instances checked) and checked for correctness against 
other web sources. 
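
The comparison and sampling steps of this process can be sketched in code. The sketch is illustrative only: in the study the comparison was carried out manually, and the check against the rest of the web (step three) relies on search engines and the researcher's judgement, so it is not modelled here. The function names and the representation of information as sets of labelled items are assumptions, as is the reading that the 10 per cent sample is sized against all the PHP instances checked.

```python
import math
import random


def candidate_unique_items(php_items, non_php_sites):
    """Items on the PHP that appear on none of the compared non-PHP sites.

    non_php_sites maps a site name to the collection of items found on it.
    The result is only a list of candidates: each one still has to be
    searched for on the rest of the web before it can be called unique.
    """
    seen_elsewhere = set().union(*non_php_sites.values())
    return set(php_items) - seen_elsewhere


def correctness_sample(php_items, unique_items, fraction=0.10, seed=None):
    """Random sample of non-unique PHP items to verify against other sources.

    The sample size is about `fraction` of all PHP items checked, capped at
    the number of non-unique items available.
    """
    all_items = set(php_items)
    non_unique = sorted(all_items - set(unique_items))
    if not non_unique:
        return []
    size = min(len(non_unique), max(1, math.ceil(fraction * len(all_items))))
    return random.Random(seed).sample(non_unique, size)


# Hypothetical items for one club.
php = ["player ratings", "match report", "fixtures", "squad list"]
non_php = {
    "official site": ["fixtures", "squad list"],
    "comprehensive sport site": ["match report"],
}
candidates = candidate_unique_items(php, non_php)
to_verify = correctness_sample(php, candidates, seed=0)
```

Keeping the candidate list separate from the verified list reflects the two-stage check: an item missing from the three compared sites may still turn up elsewhere on the web.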


Sample case of employed methodology 

To further clarify the exact workings of the methodology, an example is shown. This 
regards a preliminary pilot study involving the web pages of four football clubs. This 
explanatory case uses one of these four football clubs, Queens Park Rangers (QPR). At
the time of writing, QPR were in the 2nd Division of the Nationwide League (now 
known as the Coca-Cola Football League 1). 


Finding and selecting PHPs 
The initial searches were carried out using the Google search engine on 1 May 2003.
The exact search terms used were:


+“queens park rangers” +my[8]


The search yielded 6,780 results of which the first 300 were examined. 

The PHP chosen for comparisons was Dave’s Unofficial QPR web site 
(www.queensparkrangersfc.com) and the non-PHPs chosen were the Official QPR 
site, QPRnet.com and the BBC QPR web page (Figures 1-4). Dave's Unofficial QPR site 
was chosen to show that there is a possibility of positive results. A total number of 26 
sites (12 PHP, 7 non-PHP, 7 unknowns) were found. At this stage, the selection of PHP 
is made for the purpose of showing simply that it is possible for the project to yield
“positive” results (i.e. that PHPs make a notable informational contribution to
“footballing” on the web).


Comparing PHPs 

Once the pages to be compared had been selected, sections (and subsections) of the
PHPs were compared, and then given Relative Information Content Ratings (RICR),
a rating system the authors developed for the purpose. There are five
different RICRs. They are unique, first, better, equal/inferior and not applicable (see
Table III). These RICRs apply to one section of a web site against other web sites. In
this case study, the RICRs will apply to one section of a PHP (Dave’s Unofficial QPR
web site) against the equivalent sections of the selected non-PHPs. These RICRs will 
help determine what PHPs have to offer in terms of information resources in 
comparison to non-PHPs. 


PHP Relative Information Content Ratings (RICR) 

Unique (X). This is the rating which has the most significance for the evaluation of 
PHPs. When the PHP has a piece of information, or has an information subsection, 
which is not found on any of the non-PHPs it is being compared to, this is classified as 
Unique. The symbol for Unique is the letter X. 

An example is the Match Report (Current) subsection of Dave’s Unofficial QPR web 
site (DUW). This subsection can be found under the Latest heading on the site. Match 
reports in general contain a detailed commentary of what happened at the match in 
question in an essay or report layout. The DUW Match Reports contain, in addition to 
the essay style commentary, the ratings of every player, and the Man-of-the-Match (the 
player who made the greatest impact on the match). 

Of the three sites DUW is being compared to, all have some sort of Match Report. 
The Official web site of Queens Park Rangers Football Club (QPROW) has a skeleton 
report which does not contain an essay style report, but has instead only certain details 
about the game, such as the score, the players and any cautions (see Table IV - PHP 
comparisons for QPR). QPRnet.com (QNET) has Match Reports in a style very similar 
to DUW, with a detailed essay as well as a Man-of-the-Match. Finally, the BBC
SPORT-QPR Section (BBC) has a detailed essay, but also a live text commentary which
details every instance of the match. This live commentary can still be viewed after the
match has ended.

[Figure 2. Tooninfo (Newcastle United)]

The Unique rating is given to this section (Match Reports - Current) of the
comparisons because of the player ratings that DUW provides. This is a specific piece
of information which is available only at DUW, and not at any of the non-PHPs, and
thus qualifies for the Unique rating.

First (F). A First rating can be given to an instance where two sites provide the 
same information, but where the PHP provided this information before any of the 
non-PHPs. The symbol for a First rating is the letter F. 

As an example, we can use a comparison between DUW and QPROW. DUW has a 
subsection within Latest News called Newspapers. Though the link to the subsection is 
called Newspapers, the actual heading on this part of the web site is titled “What the 
Papers Say”. As expected, this subsection contains a roundup of stories found in the 
papers about QPR on days when there are such stories. Meanwhile QPROW has its 
own section called What the Papers Say which provides the same service. In this case, 
the two sites do not have a notable difference in terms of the service provided. What is 
notable though, is the fact that DUW started the service before QPROW. Therefore the 
rating given is First. Dave implemented the idea first, and as such his web site is 
“innovative”. 

Better (B). A Better rating can be given to an instance where the author believes that 
the information provided by the PHP is (for any reason) superior to that provided by 
any of the non-PHPs. When a Better rating is given, there will be an attempt to explain
the reasoning behind the decision in the relevant table. The symbol for Better is the
letter B. Better is really all about content, not presentation. If site X has more pictures
than site Y, site X is better. If site Y has a history section which covers a period in
history in more detail, it is better. If it covers a longer period in history, it is better. A
section needs only one aspect to be better to be placed in this category. The reason for
this is that we are not trying to show that PHPs are “better” sites, just that they have some
information which is unavailable elsewhere.

An example of the Better rating is the Club History section within History on DUW.
This section covers the history of QPR from 1885 to 2001. QPROW has a
section covering QPR’s history from 1887 to 1997. QNET and BBC have no section
dedicated to the history of QPR. As DUW covers the history of the club over the
longest period, the Better rating is given to this section.

Equinferior (=). An Equinferior rating can be given to an instance where the author
believes the information found on the PHP is either equal to or inferior to that on the
non-PHPs. This is the other side of the coin: if the PHP is not better, it is either the same
or worse. This rating can also be given if the
information within the PHP section is non-existent (this can occur when a PHP has a
section dedicated to a topic, but the section is incomplete or unavailable). The symbol
for Equinferior is =.

An example of the Equinferior rating is the Latest subsection within the Latest
News section within the Latest area of DUW. This subsection covers QPR news, and
the available news dates back to 17 August 2001. All the non-PHPs also have sections
on QPR news. QPROW has news from 16 June 2002, QNET since 9 March 2000 and the
BBC since 29 September 2000. Since DUW does not cover a period of time covered by
the BBC (or even QNET), the rating given is Equinferior.

Figure 3. The Robin’s Nest (Cheltenham Town)

Figure 4. Steve’s Walsall pages aka UpTheSaddlers.com (Walsall FC)

Table III. RICR ratings

Relative information
content rating        RICR letters  Explanation
Unique                X             Provides a resource none of the “competitors” provide
First                 F             Started providing the service which the “competitors” also
                                    provide but began before them
Better                B             The author believes that the service the PHP provides is
                                    significantly superior to that provided by the “competitors”
Equinferior           =             The PHP provides an inferior or equal service, or does not
                                    compete at all in the category
Not applicable        N/A           The comparison is not valid

Not applicable (N/A). In any instance where the comparison of the sections should not
be carried out, the RICR given will be Not Applicable. The web sites will often have
complete sections which are in no way related to their usefulness as an information
resource. When this occurs the Not Applicable rating will be given. The symbol for the
Not Applicable rating is N/A.
A good example to demonstrate such a case is in the Credits subsection within Links
on DUW. This subsection is about people who have at certain times helped the 
author/creator of DUW with the site. The non-PHPs being compared to DUW may or 
may not have equivalent sections as part of their sites; however, as this has no bearing 
on how useful any of the sites are as an information resource, the comparisons are 
never made. Thus, the rating given is Not Applicable. 
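
The five ratings defined above can be read as a small decision procedure. The sketch below is only an illustration of that reading, assuming the researcher has already made each judgement by hand. The names `Judgement` and `assign_ricr` are hypothetical, and the order in which First is tested before Better is our assumption, since the text does not say which takes precedence when both apply.

```python
from dataclasses import dataclass
from enum import Enum


class RICR(Enum):
    """Relative information content ratings and their symbols."""
    UNIQUE = "X"
    FIRST = "F"
    BETTER = "B"
    EQUINFERIOR = "="
    NOT_APPLICABLE = "N/A"


@dataclass
class Judgement:
    """Manual judgements about one PHP section (hypothetical field names)."""
    relevant: bool        # does the section bear on the site as an information resource?
    on_any_non_php: bool  # is the same information offered by any non-PHP?
    php_was_first: bool   # did the PHP offer it before the non-PHPs?
    php_is_better: bool   # is the PHP superior in at least one aspect of content?


def assign_ricr(j: Judgement) -> RICR:
    """Map a set of judgements to a rating; the precedence is an assumption."""
    if not j.relevant:
        return RICR.NOT_APPLICABLE
    if not j.on_any_non_php:
        return RICR.UNIQUE
    if j.php_was_first:
        return RICR.FIRST
    if j.php_is_better:
        return RICR.BETTER
    # Equal, inferior, or the PHP section is incomplete or unavailable.
    return RICR.EQUINFERIOR


# Example: a section whose information no non-PHP offers.
print(assign_ricr(Judgement(True, False, False, False)).value)
```

Keeping the judgements separate from the mapping makes it clear that the rating itself is mechanical once the researcher's assessments have been recorded.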


Comparison results 
This section contains the tables (Tables IV-X) of results from the comparisons made, 
using the RICRs defined above. For acronyms and abbreviations, see Table X. 


Checking uniqueness of information on the web 

The next stage involves determining whether the discovered instances of Unique (X)
information are available elsewhere on the web. The procedure consists of collecting
all the instances given the Unique (X) RICR and searching the web for them. It is
usually a two step process, and it ends as soon as the information is located, so the
number of steps taken depends on how quickly the information is found. First, a
search is carried out on Google to find the specific instance of the information. If this
yields no positive results, the same search is carried out on a meta-search engine
(Dogpile).
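
The stepwise search can be sketched as a loop over search engines that stops at the first hit. This is a minimal illustration, not the tooling used in the study: the searches were carried out and inspected by hand, so the `search` callables and the `is_match` test below are hypothetical stand-ins for a manual search and a manual inspection of each result.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# A search engine is modelled as a function from a query to an ordered
# sequence of results (hypothetical stand-in for a manual search).
SearchEngine = Callable[[str], Iterable[str]]


def locate(query: str,
           engines: List[Tuple[str, SearchEngine, int]],
           is_match: Callable[[str], bool]) -> Optional[str]:
    """Try each engine in turn, inspecting at most max_results results.

    Returns the name of the engine that located the information,
    or None if no inspected result matched.
    """
    for name, search, max_results in engines:
        for position, result in enumerate(search(query)):
            if position >= max_results:
                break
            if is_match(result):
                return name  # the procedure ends once the information is located
    return None


# Stub engines for illustration only; they return made-up result labels.
def stub_google(query: str) -> List[str]:
    return ["result %d" % i for i in range(378)]


def stub_dogpile(query: str) -> List[str]:
    return ["result %d" % i for i in range(54)]


engines = [("Google", stub_google, 300), ("Dogpile", stub_dogpile, 30)]
print(locate('+QPR +"player ratings"', engines, lambda r: False))
```

The per-engine limits mirror the numbers of results inspected in the worked example (300 and 30); in practice they are a matter of the researcher's judgement.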

This procedure was carried out with Dave’s Unofficial QPR web site, and the results 
are in Table XI. To illustrate further, the details of the first search carried out will be 
discussed. 

This search was for the player ratings for QPR players, for the current season. The 
search terms used were: 


+ QPR + “player ratings” 


Google returned 378 results and Dogpile returned 54 results. Of these results, the first 
300 from Google were checked, and the first 30 from Dogpile. Though much looked 
outwardly promising, the player ratings required were not found. 

The full results for Unique instances of information found on DUW are
available in Table XI.

From Table XI it can be seen that 62.5 per cent (5 out of 8) of the instances of Unique
information were not found using search and meta-search engines, and 66.7 per cent
(2/3) of the instances which were found could be found on other PHPs.
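
As a check on the arithmetic, the two proportions can be recomputed from the availability column of Table XI. The snippet below is a small sketch; the list simply transcribes the eight availability entries in table order, and the variable names are ours.

```python
from collections import Counter

# Availability of each Unique instance, transcribed from Table XI.
availability = [
    "No", "No", "No", "Yes (non-PHP)",
    "No", "Yes (PHP)", "Yes (PHP)", "No",
]

counts = Counter(availability)
not_found = counts["No"]
found = len(availability) - not_found

# Share of instances that could not be located anywhere else on the web.
pct_not_found = 100 * not_found / len(availability)
# Of the instances that were located, the share found on other PHPs.
pct_found_on_php = 100 * counts["Yes (PHP)"] / found

print(f"{not_found} of {len(availability)} not found ({pct_not_found:.1f} per cent)")
print(f"{counts['Yes (PHP)']} of {found} found instances were on other PHPs "
      f"({pct_found_on_php:.1f} per cent)")
```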


Correctness of information 
During the comparisons, the correctness of the information was also checked. For 
every web site checked, two numbers are required so as to be able to check the 
information randomly. The first number is a starting position, and the second number 
is a “skip”. In the QPR example, the first number was 5, and the skip was 6. Therefore, 
the first instance checked was the League Table (the fifth instance from the top), and 
after that every sixth instance was checked (as the skip was 6). These random numbers 
were generated by an online random number generator (Random.org, 2004). The 
information was then put into tables (see Table XI). 

In any instance where the section chosen could not be checked for whatever reason, the
next one available on the list was checked, and this did not affect the position of the
skip.
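
The selection rule amounts to systematic sampling with a random start and a fixed skip. The following sketch shows one way to express it, assuming the sections are held in a list in the order in which they appear on the site; `sample_sections` and `can_check` are hypothetical names, and the fall-forward behaviour encodes our reading of the rule that an uncheckable section is replaced by the next one without moving the skip.

```python
from typing import Callable, List, Sequence


def sample_sections(sections: Sequence[str], start: int, skip: int,
                    can_check: Callable[[str], bool] = lambda s: True) -> List[str]:
    """Pick the start-th section (1-based) and every skip-th one after it.

    If a chosen section cannot be checked, the next checkable section is
    used instead, but later picks are still counted from the original
    position, so the skip grid does not shift.
    """
    chosen: List[str] = []
    for position in range(start - 1, len(sections), skip):
        candidate = position
        while candidate < len(sections) and not can_check(sections[candidate]):
            candidate += 1
        if candidate < len(sections) and sections[candidate] not in chosen:
            chosen.append(sections[candidate])
    return chosen


# Illustration with 23 placeholder sections, start 5 and skip 6 as in the QPR example.
placeholder = ["section %d" % i for i in range(1, 24)]
print(sample_sections(placeholder, start=5, skip=6))
```

With 23 sections this selects the 5th, 11th, 17th and 23rd, which agrees with the four instances checked for the QPR site.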


Table V. PHP comparisons for QPR


Table VI. PHP comparisons for QPR

Checking the correctness of information 
In order to verify the correctness of the information, the instance of the information is 
checked against the same section in another web site. The web site chosen would be 
known to contain the information either because of the PHP comparisons, or the 
post-comparison checking of the availability on the web. 
A section of a web site might contain any number of facts or pieces of information. It
would not be feasible, given the size of this project, to check every individual fact. As
such, only seven “facts” are checked from any section.

Once compared, a “degree of correctness” rating is given, according to the number 
of correct or incorrect “facts” checked (see Table XII). 
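
The rating rule can be written as a short function. This is a sketch using the thresholds given in the degree of correctness table (one error or fewer is High, two or three is Medium, four or more is Low, out of seven facts checked); the function name is ours.

```python
def degree_of_correctness(errors: int, facts_checked: int = 7) -> str:
    """Return the degree of correctness for a section.

    Thresholds follow the degree of correctness table and assume the
    default of seven facts checked per section.
    """
    if not 0 <= errors <= facts_checked:
        raise ValueError("errors must lie between 0 and the number of facts checked")
    if errors <= 1:
        return "High"    # all correct, or almost all correct
    if errors <= 3:
        return "Medium"  # more than 50 per cent correct, but not High
    return "Low"         # less than 50 per cent correct, or unusable


for n in range(8):
    print(n, degree_of_correctness(n))
```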


Results 

As mentioned already, the pilot study only involved four PHPs in total. A team
from each English professional football division was taken, and the “best”[9] PHP 
was found for this team. These four PHPs were each then compared to three 
non-PHPs. These were always the BBC Sport section on the team in question, the 
official team web site, and an “organised” unofficial site[10] on the team in 
question. At this stage of the study, these preliminary “sample” results should aid 
in the understanding of the project results (in the future) as a whole (see 
Table XIV). 

In the study, there were a total of 69 sections/sub-sections of PHPs examined and
compared. This figure is the total number of sections and subsections that the PHPs
consisted of. The sites that they were compared to (i.e. the non-PHPs) could have
contained more or fewer subsections.

The comparisons revealed a number of points. Overall, this pilot study suggests
that at least these PHPs offered a good deal of “unique” information. Out of the 69
instances, 23 contained some kind of original information. This is exactly a third of
the total subsections of the PHPs. In addition to this there were six instances where the
information provided by the PHP was available from it before it was available
from the non-PHPs. There were also seven instances where the information
provided by the PHPs was of a better quality than that of the non-PHPs. In total,
there were 36 instances (52 per cent of the total PHP subsections) where the
information provided by the PHPs was either unique, innovative or superior to that of
the non-PHPs.
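
The totals in this paragraph follow directly from the counts reported in Table XIV. The snippet below is a brief sketch that recomputes them; the dictionary transcribes the table, and the variable names are ours.

```python
# RICR counts for the whole pilot study, transcribed from Table XIV.
ricr_counts = {
    "Unique (X)": 23,
    "Equinferior (=)": 21,
    "Not applicable (N/A)": 12,
    "Better (B)": 7,
    "First (F)": 6,
}

total = sum(ricr_counts.values())

# Sections where the PHP was unique, first or better than the non-PHPs.
favourable = sum(ricr_counts[k] for k in ("Unique (X)", "First (F)", "Better (B)"))

share_favourable = 100 * favourable / total
share_unique = 100 * ricr_counts["Unique (X)"] / total

print(f"{favourable} of {total} sections ({share_favourable:.0f} per cent)")
print(f"Unique: {share_unique:.1f} per cent of all sections")
```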

Looking at the same numbers on a case by case basis, we see that for every PHP at
least 27.6 per cent of the subsections (over a quarter) contained some unique
information. This pilot shows that at least the possibility of finding unique information
on PHPs is quite high (see Tables XV-XVIII).

The uniqueness of instances of information found on the PHPs is quite high, as
Figure 5 shows. The fact that less than a fifth of the instances given the Unique RICR
turn out not to be truly unique reinforces the ideas of the hypothesis. It must be stated
that the sample is too small to serve as conclusive evidence, but it is a “positive”
guideline.


Section: history  RICR  Dave's unofficial QPR site  Official QPR site                QPRnet.com  BBC sport - QPR section
Hall of fame      B     17 players with biography   Two players at any one time      N/A         N/A
                                                    (Loftus Legends)
Club history      B     1885-2001                   1887-1997                        N/A         N/A
Results           F     1996-1997 until present     Current season (2002-2003), and  N/A         Since April 2000
                                                    2001-2002 season


Table VII. PHP comparisons for QPR

Section: links  RICR  Dave's unofficial QPR site  Official QPR site  QPRnet.com  BBC sport - QPR section
QPR links       B     21 links                    N/A                15 links    1
Football links  B     17 links                    N/A                11 links    No separate section
Other links     N/A   Not relevant


Table IX. PHP comparisons for QPR

Section: about    RICR  Dave's unofficial QPR site   Official QPR site  QPRnet.com               BBC sport - QPR section
Web site history  N/A   Not relevant
Stadium guide     X     Directions, stadium layout   Directions,        Directions, ticket       N/A
                        and history, ticket prices,  stadium layout,    prices, match day
                        pubs, parking                ticket prices      programmes, fanzine
Contact the club  X     Address, phone numbers,      Address, phone     Address, phone numbers,  N/A
                        e-mail addresses             numbers            names of staff
Credits           N/A   Not relevant


Table X. Acronyms and abbreviations

Acronyms/abbreviations  Explanation
Apps                    Number of appearances
Subs                    Number of times used as substitute
MOTM                    Man of the match
POTY                    Player of the year
POTM                    Player of the month
GOTM                    Goal of the month
YPOTM                   Young player of the month
MatchOTM                Match of the month
POTS                    Player of the season
GOTS                    Goal of the season
MatchOTS                Match of the season

Table XI. Unique instances of information (QPR)

DUW section                       Description of instance              Availability (type)  URL (if available)
Latest - match reports - current  Player ratings                       No
Latest - match reports - archive  Player ratings                       No
Players - players                 Average ratings                      No
Players - monthly awards          Young player of the month            Yes (non-PHP)        www.gpr-mad.co.uk
Interactive - QPR CD-ROM          CD containing the 1982 FA Cup final  No
Interactive - quiz                QPR quiz                             Yes (PHP)            http://web.onetel.net.uk/~carlholl/
About - stadium guide             Where to find parking                Yes (PHP)            http://dspace.dial_pipex.com/town/park/yfh45/qpr.htm
About - contact the club          Club e-mail address                  No


Table XI. Correctness of information (QPR). Starting point 5. Skip 6

DUW section                Description of instance               Degree of correctness  Web site used for verification
Latest - league table      League table                          High                   BBC sport
Players - player profiles  Additional information on players     High                   Official QPR web site
History - results          Previous results of football matches  High                   Official QPR web site
Links - QPR links          Links to other QPR sites              Medium                 N/A


Table XII. Degree of correctness

Degree of correctness  Number of errors (out of 7)  Description
Low                    4 or more                    The information is either completely incorrect, or
                                                    incorrect enough to be unusable. For the information
                                                    to be deemed unusable, it must be less than 50 per
                                                    cent correct
Medium                 2 or 3                       The information is more than 50 per cent correct but
                                                    not in the high category
High                   1 or less                    The information checked is all correct, or almost all
                                                    correct. Information is deemed to be “almost all
                                                    correct” if only one error is found
Finally, the information on the PHPs appears to be correct the great majority of
the time (see Figure 6). Though this sample for checking the correctness of the
information is also quite small, the number of instances with a “High”
correctness rating is significantly high (13 out of 14 instances in total had a
“High” rating).


Table XIV. Total RICRs for pilot study

RICR                  Actual number
Unique (X)            23
Equinferior (=)       21
Not applicable (N/A)  12
Better (B)            7
First (F)             6
Totals                69


Table XV. RICRs for Newcastle United


Table XVI. RICRs for Walsall


Table XVII. RICRs for Queen’s Park Rangers
Table XVIII. RICRs for Cheltenham Town

RICR                  Actual number  %
Unique (X)            5              55.6
Equinferior (=)       4              44.4
Not applicable (N/A)  0              0
Better (B)            0              0
First (F)             0              0
Totals                9              100


Figure 5. Uniqueness (n = 23): Unique 61%; Available on PHP 22%; Available on non-PHP 17%


Figure 6. Correctness (n = 14): Medium 7%; Low 0%


Initial conclusions

The findings of our study so far suggest that PHPs may make a significant
contribution to the overall informational value of the WWW.
Not only are there contributions made in terms of adding informational content to the
WWW, but there are also contributions in “innovational” terms. The fact that certain
informational services provided by the PHPs were introduced before the non-PHPs
adopted them implies that PHPs are at times leading the way in popular ideas for
information dissemination.

However, the informational contribution should not be underestimated either. In
1999, 2 per cent of the entire “publicly accessible web” consisted of PHPs (Lawrence
and Giles, 1999). This amounted to some 16 million PHPs. Obviously, the
demographics have changed since then, and it must be made clear at this point that,
at the time of writing, there is no specific figure available for the exact number of PHPs
on the web. However, this is not to say that the number of PHPs has fallen. As
such it is still too early to pinpoint the absolute informational contribution of the PHPs,
but at this stage, most of the evidence points to the actual amount being notable rather
than negligible. A clearer picture should emerge on the completion of this project.


Further study 

As mentioned earlier, this project aims to discover the contribution of PHPs to the 
WWW in terms of informational content. As such the focus is on discovering what sort 
of contribution the PHPs make. However, this project will not be able to answer certain 
questions which will need to be answered in the long term to show definitively the 
value of PHPs. 

This project will not be looking at finding a figure for the total number of PHPs, 
and therefore will not be able to give an absolute figure to the amount of original or 
innovative content found on PHPs. In addition, the focus of this study is on PHPs in 
the field of professional football in England. Though the fact that four divisions with 
varying levels of interest are to be examined will give a broader picture of the 
differing levels of PHP contribution, more fields will need to be examined to get a 
clearer picture. 

In the long term, it is crucial that these points are clarified for the informational 
communities to ensure the efficient usage of all the resources of the WWW. This 
efficient usage can benefit not only the academic researchers, but also the general 
public who only stand to gain by the additional resources at their fingertips. 


Notes 


1. By “original information”, we mean information which is not otherwise available on the 
WWW or which is available only on other PHPs. 


2. This second aim is part of a longer term project to decipher more accurately the 
informational content of the PHP. 


3. BBCI, BBC3, ITV1, ITV2, FIVE, British Eurosport, Sky Sports 1, Sky Sports 2, Sky Sports 3,
Sky Sports Extra, PremiershipPlus, MUTV, Chelsea TV.


4. Based on the names of programmes with the word “football” in the title. Average in the
month of April 2004 (exact figure: 18.05 hours per day). 


5. A search for programmes containing the word “football” in the title was carried out and
revealed 30 (exactly 29.52) programmes a week on average in the UK. The search was
carried out in April 2004 (total number of shows was 124).


6. Three out of the following four football magazines were found in ten London newsagents 
(ten newsagents were checked in total): FourFourTwo, Shoot (Monthly), Match Magazine, 
and World Soccer. 


7. If one of the following is true, the web page is considered to represent a person rather than a
group: 
(1) The author of the web page has made references to the web page as “my web page”. 
(2) The web page has a single name in the “Credits” section of the web page. 
(3) The author of the web page sends an e-mail to the researcher claiming that the web page 
represents a single person. 


8. The + ensures that the documents returned contain the words following it. The “quotes”
then treat the “words between the quotes” as a set phrase. “my” has been used because it is
indicative of a PHP where an author often writes, for example, about “my” Queens Park
Rangers site.


9. By “best”, we mean the site which looked at first sight to contain the most information and
therefore be the most suitable to help develop a methodology. Though it is not possible to 
accurately decipher how much information is contained on a web site without careful 
examination, this was left to the discretion of the researcher. 


10. By “organised” unofficial web site, we mean a site which is clearly a non-PHP and is 
affiliated with an organisation which has sites for every football team. Examples of sites like
these are Footie-Mad rivals, or Sports Network. 


Abstract

Purpose - To determine and compare approaches to the education and training of librarians for
work in digital libraries. More precisely, to identify, in general terms rather than specifically, the
important competencies required by information professionals in creating and managing digital
libraries, and in facilitating their use, and to assess how these competencies are treated in LIS
education and training, and therefore how the capacities of the information professions are being
developed.

Design/methodology/approach — Literature analysis of the skill sets required by librarians 
working with digital materials. Evaluation of formal education and of professional development 
programmes in the UK and in Slovenia, to assess how these needs are being met. 

Findings — Both formal education and continuing development training are adapting to cover 
aspects of the digital library environment, both in the UK and in Slovenia. This is happening as 
part of the normal process of the redesign of degree programmes and of training courses. Digital 
library skills and knowledge — embodying conceptual, semantic, syntactic and technical aspects — 
are being included in existing courses, for the most part, rather than in entities labelled “digital 
library”. This approach has strengths and also weaknesses. While there is some agreement on core 
topics, there is much variation in how they are presented, and in the relative importance given to 
them.

Research limitations/implications — Based on comparison of education and training programmes 
in two countries, the UK and Slovenia. 

Practical implications — Recommendations for curricula are made. 

Originality/value — Provides an insight into education and training needs in a developing and 
important area. 

Keywords Digital libraries, Education and training, Professional education, International standards, 
United Kingdom, Slovenia 

Paper type Literature review 


Introduction 

Library and information science has always been concerned with the collection, 
organization, storage and retrieval of materials and information, in order to respond to 
users’ queries. It has also often been noted that new technologies for the generation, 
distribution, processing and storage of information have brought changes in the 
nature, volume, and format of that information. The digital library is only the most 
recent of these. 
A question which arises is how these developments influence the demands that are
made of information professionals and, consequently, what this means for their
education. In order to be effective and efficient in their job:


[T]oday’s information professionals need to learn more about online information
retrieval, but at the same time they need to learn the theory, tools and techniques behind the
traditional approaches to organizing and processing information, much of which will be
applicable in the storage and retrieval of electronic information in digital libraries
(Chowdhury, 1999, p. xv).


And, as Deegan and Tanner (2001, p. 225) point out: 


In developing skills for managing, creating and providing services in the digital environment,
training and education will become ever more important. There will be increased need for
educational organizations to inform students of the new realities and the new skills that they
will need in the digital environment.


The skills and competences required for the digital library have been discussed by a 
number of authors, including Sreenivasulu (2000), Chandler (2001), Prytherch (2001), and 
Chowdhury and Chowdhury (2003). They are wide-ranging, including: creating search 
strategies; evaluating web sites; guiding and training users; integrating networked 
sources; analysing and interpreting information; creating metadata; imaging and 
digitising; designing interfaces and portals; project management; and many more. 

The fact is that there are parts of the “traditional” library school’s curriculum that
are still very much relevant in today’s fast changing computerized information world. 
The role of digital librarians may still be, as Marcum (2003, p. 276) puts it, “stewards of 
the world’s intellectual and cultural heritages”. 

In this paper we try to identify these “perennial” topics, and to connect them to the
contemporary issues and technologies, which should, in our opinion, be reflected in the 
education of information professionals, at present and in the future. The paper has two 
main aims: 


(1) to identify — in general terms, rather than specifically — the important 
competencies required by information professionals in creating and managing 
digital libraries, and in facilitating their use; and


(2) to assess how these competencies are treated in LIS education and training; and
therefore how the capacities of the information professions are being developed. 


The digital library 
Another current debate is that about the meaning of “digital library” itself. The term is 
used to denote a number of concepts and situations, and also has a number of
more-or-less synonymous terms: electronic library, virtual library, library without walls, 
networked library, complex library, etc. (see, for example, Bawden and Rowlands 1999). 
A useful pragmatic explanation of the term is that used by the UK’s INSPIRAL
project, which investigated the linking of virtual learning environments with digital
libraries (http://inspiral.cdlr.strath.ac.uk):


A digital library provides digital resources and services. The resources may be in various
digital formats. The services are based on traditional library skills, enabling materials to be
evaluated, organised, stored, retrieved and used. In some cases, preservation of special
collections and access to those collections is part of their remit. Unlike hybrid libraries, digital
libraries are not dependent on any physical location, but they do provide a single online
access point, and provide access to remote resources as well as to their own collection.


This explanation neatly encapsulates the link between skills required in the traditional, 
and the digital, library. 

For the purposes of this paper, however, we will make use of a more complex 
two-part definition of a digital library, which is given in the material from a workshop 
on the social aspects of digital libraries (Borgman et al., 1996):


Digital libraries are a set of electronic resources and associated technical capabilities for 
creating, searching, and using information. In this sense, they are an extension and 
enhancement of information storage and retrieval systems that manipulate digital data in any 
medium (text, images, sounds; static or dynamic images) and exist in distributed networks. 
The content of digital libraries includes data, metadata that describe representation, creator, 
owner, reproduction rights, and metadata that consist of links or relationships to other data or 
metadata, whether internal or external to the digital library. 


Digital libraries are constructed — collected and organized — by [and for] a community of 
users, and their functional capabilities support the information needs and uses of that 
community. They are a component of communities in which individuals and groups interact 
with each other, using data, information, and knowledge resources and systems. In this sense 
they are an extension, enhancement, and integration of a variety of information institutions as 
physical places where resources are selected, collected, organized, preserved, and accessed in 
support of a user community. These information institutions include, among others, libraries, 
museums, archives, and schools, but digital libraries also extend and serve other community 
settings, including classrooms, offices, laboratories, homes and public spaces. 


Such definitions are useful in grounding discussions of digital library issues; see, for
example, Chowdhury (2004, ch. 22), Borgman (2003), Marcum (2003), Bawden and 
Rowlands (1999), and Rowlands and Bawden (1999). They are used here as the basis 
for a discussion of the competences required by information professionals working in 
such environments. 

The main differences between these and “traditional”, or pre-digital libraries, as we 
see them (following Rowlands and Bawden, 1999), are:


(1) A change from ownership to access. The library no longer provides only
materials which it owns; it also provides access to digital networked
resources beyond its physical location. This immediately changes the
nature of the library as a physical setting. The librarian’s
competences also change, including, besides the traditional competences (e.g.
knowledge organization, metadata), competences connected with
information and communication technology (e.g. knowledge about the structure
and functioning of computerized systems, internet searching,
techniques for evaluating information, web page design, etc.).


(2) A change from known item access and physical browsing to search and
navigation, both in collections and in individual items, and in the nature of the
library as a “place”. Within the context of the digital library it is no longer
possible to physically browse through a library collection or an individual book.
This means that the librarian needs additional competences besides those
connected with knowledge about the physical arrangement of the library collection
and kinds of information resources (e.g. bibliographic aids). This requires
competences about information retrieval systems, their structure, retrieval
levels, commands and retrieval techniques, etc.


Different information resources can appear very similar due to the same interface (e.g. 
that of a browser), which has a “homogenising” effect. For a librarian it is important to 
help users to distinguish between different kinds of information resources, especially 
their functions and purposes. This is particularly so as new forms of resource (web
rings, blogs, etc.) appear.

As libraries become “more digital”, it is also necessary to re-think the idea of the 
library as a place. Does the idea of a library imply a physical location? If so, what is this
used for? A store or archive? A quiet place for study and reflection? A stimulating place 
for creative innovation? There are many possible answers, and the best solutions still 
have to be worked out. 


(3) Changing expectations of users. These issues concern not only end-users; they
are also important from the librarian’s point of view, since they bring changes
to the education process. In the past, information systems were designed with the
expectation that people would adapt to them. However, with the greater
emphasis on user friendliness and usability, this is no longer so. Borgman (2003, p.
89) says that users have “higher expectations of information systems”, that
the systems should be easy to learn, use and relearn, as well as flexible in
adapting to a more diverse user population.


User expectations have to be handled carefully. Many users will come to believe that 
“all information” is available to their PC, can be found through simple Google-like 
searching, and will always be up-to-date. It is important to convince them that it is 
worth looking for printed material, and for material available through specialised 
search systems. It is also necessary to be realistic: digital libraries are often better at 
providing metadata records, and location information, than at giving full-text of 
everything. This leads to the important role of the librarian as facilitator and helper; 
but the facilitation and help must be very realistic. 

Borgman (2003) also emphasizes that digital libraries will never be as easy to use as 
automatic machines or one-purpose technologies, and that working with them 
effectively requires some learning. Complex cognitive tasks are involved in the work 
with them. On the one hand “workers, learners and users need to understand a variety 
of general computing concepts as well as concepts and skills specific to applications” 
(National Research Council, in Borgman, 2003, p. 99), and on the other there are skills 
connected with information needs and information seeking behaviour. These are 
strongly connected with problem solving behaviour which can be divided into four 
steps, i.e. the four cognitive processes initially identified by Polya (Borgman, 2003):

(1) Understanding the problem. 

(2) Planning a solution. 

(3) Carrying out the plan. 

(4) Checking the results. 


Planning a solution is the most complex of these processes, varying according to three
factors: degree of problem definition; amount of expertise in the problem domain; and
knowledge about the resources and operations available to solve the problem.
Depending on these features, users act as “novices” or “experts”. The latter “use a 
combination of system features, often taking an iterative approach that tests multiple 
strategies for finding the information sought” (Borgman, 2003, p. 103). They show a 
sophisticated combination of knowledge and skills, which help them search effectively 
and efficiently. These techniques can be taught to novices, and can be, to an extent, 
incorporated into information systems as functions (e.g. pre-coded search tactics and 
pre-defined search limiters or expanders). 

On the basis of these considerations, Borgman (1986, 1996) proposes a model for 
knowledge and skills the users and/or librarians will need. It includes: 


* conceptual knowledge;
* semantic and syntactic knowledge; and
* technical skills.


Conceptual knowledge denotes the user’s model or understanding of the type of digital 
library with which they are dealing. The conceptual knowledge of the search process is 
used for translating an information need into a plan for executing the search. The 
success of searching in digital libraries depends heavily on the user’s ability to 
construct a mental model of the given information space (Dillon, 2000; Dillon and 
Gabbard, 1998). 

In contrast with the conceptual knowledge which is used to plan and refine 
searches, semantic and syntactic knowledge is more concerned with details and with 
individual systems or applications. Semantic knowledge means knowledge of the 
operations available to execute a search plan. Shneiderman (1992) defines syntactic 
knowledge as the user’s understanding of the commands or actions in a specific 
system. The expert user possesses the following semantic and syntactic knowledge: 


* understanding of the general characteristics common to most information
systems;
* understanding of the features characteristic of specific kinds of information
systems; and
* ability to become familiar with the features of a new system, and then to adapt
the search strategy to these features.


By “technical skills” is meant those basic computer skills which are prerequisite for
developing conceptual, syntactic, and semantic knowledge, within the context of a 
digital library. These skills include knowing how to use computer devices and being 
familiar with the digital conventions, e.g. screen display. It is important to be aware 
that the levels of these skills in the population are very diverse, and also that different 
applications and systems require very diverse levels of these skills. Borgman (2003, 
p. 109) states that “as digital libraries are designed for more general audiences, a 
broader range of skill levels will need to be accommodated in many applications”.


Analysis of the programmes of formal and continuing education in Slovenia
and in the UK

In order to assess how the knowledge and skills required for the digital library, as
identified and categorised above, are currently encapsulated in LIS education and
training programmes, a comparative study was carried out. This compared the
situation in the UK and in Slovenia[1], as representative of a relatively large country
with a long tradition of LIS education from a variety of providers, and a small and
relatively newly independent country with a more limited range of provision. The study
was based on a qualitative analysis of courses, in order to:


* analyse the contents of education and training offered in both countries; 
* identify the digital library contents in the curricula; and 


* derive from these the competences for the digital library in accordance with the 
Borgman three-part model of knowledge and skills. 


This approach looks at what is currently provided, and provides a pragmatic 
counterpart to the long lists of potentially useful topics and skills mentioned by writers 
in this area. 


Slovenia 

We looked at the digital library skills and competences which are incorporated in the 
curriculum of the Department of Library and Information Science and Book Studies at 
the University of Ljubljana, and the CPD training courses offered by the National and 
University Library and the Institute of Information Sciences. 

Formal education. In the area of formal education in LIS, the Department of Library 
and Information Science and Book Studies at the University of Ljubljana is the only 
institution providing this kind of education in Slovenia. Undergraduate education has 
had a relatively long tradition, in recent years being extended by postgraduate masters 
and doctoral studies. We identified seven undergraduate (Table I) and eight
postgraduate courses (Table II) offering some digital library content.

Continuing education. Two institutions offer continuing education in the LIS field in
Slovenia: the National and University Library in Ljubljana (NUK), and the Institute for
Information Science at the University of Maribor (IZUM). They mostly offer one- to two-day
courses dealing for the most part with conceptual, syntactic and semantic knowledge:


(1) NUK. Six courses were found to offer digital library content (Table III).
(2) IZUM. Four courses were found to offer digital library content (Table IV).


Discussion. Conceptual knowledge is mostly present in the contents of the introductory 
courses which are aimed at acquiring and shaping the model of the digital library. This 
kind of knowledge enables the development of competences connected with
planning and refining the use of the digital library, i.e. information problem shaping,
and planning and refining the search process. 

In the courses where conceptual knowledge has been introduced, it is then possible 
to extend it to semantic and syntactic knowledge. This means the development of 
competences for understanding the operations and commands, or actions, to execute a 
search plan in specific digital library environments.

To be able to achieve this, specific technical skills are taught, e.g. use of
keyboard/mouse and other hardware devices, and use of basic computer applications, 
such as operating system and word-processing and spreadsheet software packages. It 
should be mentioned that these skills are mostly acquired prior to entering the faculty, 
therefore not much emphasis is given to them in the curriculum. They are taught only 
in the first year of undergraduate study. 
Table II. Ljubljana University, postgraduate modules

Courses                           Content                             Knowledge/skills

Organization of information       Organization and sorting of         Conceptual knowledge
                                  information with the purpose of
                                  increasing accessibility
Theory of information sources     Typology of information sources     Conceptual, semantic and
and services                      and their practical use             syntactic knowledge
Theory of information             Information as a phenomenon         Conceptual knowledge
Databases, data structures and    Organization of data                Conceptual knowledge
basic programming                 Search techniques in various        Semantic and syntactic
                                  databases                           knowledge
                                  Basics of programming               Semantic and syntactic
                                                                      knowledge
Database design                   Introduction to the process of      Conceptual, semantic and
                                  the creation of databases           syntactic knowledge
Visual information in digital     Presentation, organization and      Conceptual knowledge
libraries                         protection of image data, digital
                                  video recordings
                                  Retrieval of digital visual         Semantic and syntactic
                                  materials                           knowledge
Information retrieval systems     Sorts of systems                    Conceptual knowledge
                                  Process of systems design           Semantic and syntactic
                                                                      knowledge
Digital libraries                 Collections of e-sources and        Conceptual, semantic and
                                  technologies for creation,          syntactic knowledge
                                  searching and use of this
                                  information


In postgraduate study, all three kinds of knowledge and skills are present throughout 
most courses. This requires more complex competences, necessary for planning and
refining the retrieval process, and for executing this plan. In addition, postgraduate
study also incorporates the development of complex competences for some aspects of 
the creation of information sources. 

In the continuing development area, courses were found to cover conceptual, 
semantic and syntactic knowledge, but within specific areas, dealing either with a 
narrow segment of information sources and developing specific competences within 
these (e.g. use of e-Journals or OPAC), or dealing with specific technical skills in 
connection with the use of computers (e.g. the ECDL course). The continuing education 
programmes develop specific competences which may not have been emphasized 
enough in the formal education programmes. 


UK 

LIS education in the UK is provided on a larger scale than in Slovenia, and a similarly
exhaustive study was not possible. A total of 16 undergraduate and 47 postgraduate
programmes in 18 universities are accredited by CILIP, the UK’s professional LIS
organisation. Postgraduate course provision at two universities (City
University London and Sheffield University) was examined in detail, while a
confirmatory study was carried out from the web sites of other institutions. The
continuing professional development (CPD) programmes of the three leading UK
providers (Aslib, TFPL and CILIP) were also examined.

Table IV. Maribor, IZUM, CPD courses

Course                            Content                             Knowledge/skills

COBISS/OPAC                       Introduction to databases           Conceptual, semantic and
                                  included in the COBISS/OPAC,        syntactic knowledge
                                  and their use
Use of full-text databases        Introduction to full-text           Conceptual, semantic and
                                  databases, and their use            syntactic knowledge
Use of Web of Science service     Introduction to Web of Science      Conceptual, semantic and
                                  databases (citation indexes), and   syntactic knowledge
                                  their use
Advanced searching in segments    Advanced IR techniques              Semantic and syntactic
COBISS/OPAC and                                                       knowledge
COBISS2/cataloguing

Formal education. Postgraduate courses are a more common route into the LIS
profession in the UK than are undergraduate courses, so the postgraduate
library courses offered at two universities, City University London and Sheffield
University, were examined. Presented in the same way as for the Slovenian case, those
courses (referred to as “modules” in the UK context) with particular relevance to
digital libraries are shown in Tables V and VI.

It will be seen that conceptual aspects of digital libraries enter into several courses; 
technical skills are covered in a basic IT course. A specific “digital libraries” course will 
be introduced as part of a general redesign of City University’s postgraduate 
programmes in 2004-2005. 

Again, conceptual aspects of digital libraries enter several courses. 

This picture is confirmed by a less-detailed examination of other UK library 
courses. No department offers a programme in digital libraries as such, although there 
are some in related areas, such as “Networked Information Management” at the 
University of Central England. 

All offer courses in topics such as library management, metadata and knowledge 
organisation, information retrieval etc., which have clear relevance to the digital 
library, and all offer some form of basic IT skills training, which may be extended into 
areas such as information architecture and web-based systems. 

It is worth noting that programmes other than strictly library-related ones may fit 
graduates for roles in digital libraries — examples would be Information Systems, 
Information Management, and Electronic Publishing — but these have not been 
examined here. 

Continuing education. LIS continuing education in the UK is provided by three main 
providers: Aslib, TFPL and CILIP. All these are one-day or two-day courses, including 
conceptual, syntactic and semantic aspects, but not usually technical skills. Analysis of 
their current training programmes shows that none offer any course specifically
devoted to digital libraries, but all offer some relevant courses, as follows: 


(1) Aslib:
* Organising digital information and knowledge;
* Metadata;
* Virtual learning environments and library/information services;
* Electronic serials management;
* Information literacy;
* Strategic approach to internet research; and
* Intricacies of internet search tools;

(2) TFPL:
* Information architecture;
* Internet searching; and
* Library portals;

(3) CILIP:
* Metadata.

Table VI. Sheffield University, postgraduate modules

Course                            Content                             Knowledge/skills

Academic and research libraries   Includes integration of digital     Conceptual knowledge
                                  resources, copyright etc.
Access to information resources   Typology of information             Conceptual, semantic and
                                  sources, including electronic       syntactic knowledge
                                  sources, and their use
Collection management             Includes digitisation and           Conceptual knowledge
                                  electronic resources
Practical computing               IT basics. Organisation and         Conceptual knowledge
                                  manipulation of data.               Semantic and syntactic knowledge
                                  Web-based information               Technical skills
Libraries, information and        Aspects of librarianship,           Conceptual knowledge
society                           including digital libraries
Information searching and         Sorts of systems. Effective         Conceptual knowledge
retrieval                         retrieval techniques. Metadata      Semantic and syntactic
                                                                      knowledge


It is worth noting that several of these courses are new in 2003-2004, and that, for 
example, Aslib’s “Metadata” and “Organising Digital Information and Knowledge” 
courses have proved so popular that extra sessions have been arranged. This is an 
example of how a CPD provider can react quickly to meet a new need. 


Conclusions and recommendations 

We have seen that both formal education and continuing development training are
adapting to cover aspects of the digital library environment, both in the UK and in
Slovenia. This is happening as part of the normal process of the redesign of degree
programmes and of training courses. Digital library skills and knowledge,
embodying conceptual, semantic, syntactic and technical aspects, are being
“embedded” in existing courses, for the most part, rather than in entities labelled
“digital library”.





In some ways this can be seen as a strength. It allows for a gradual and incremental 
development to meet new needs, and avoids the danger that relevant “traditional” 
skills and perspectives will be ignored. 

Conversely, it may lead to a piecemeal and partial approach, which may mean that
some courses, and hence some students and participants, will fail to gain an overall
appreciation of the topic, and may lack some essential competencies and capacities.
The “overall appreciation” aspect is perhaps most important, since it is virtually
impossible to cover all of the relevant skills and competencies within any one course.
While it is clear that there is considerable overlap and convergence between the
courses examined here, there is also considerable divergence, which may be neither
planned nor positive.

It seems clear that there is general agreement that certain general topics are of
importance: typology of resources and domain analysis; computer/network literacy
and retrieval skills; analysis of information; user characteristics and behaviour;
consciousness raising and information literacy; new forms of metadata; etc. But there is
considerable scope for choosing which of these to focus on, and how to present them.
While it is not desirable to recommend, still less impose, a closely-defined curriculum,
some guidance might be helpful.

One useful contribution could be for a generally agreed template or checklist of
topics and competencies for the digital library to be drawn up, as a way of
consolidating the several lists which have been published. This might best be done by
national professional bodies, who might be expected to be able to adapt general
principles to their local situation. More broadly, it might be a task for an international
body such as EUCLID (the European Association for Library and Information
Education and Research).


Note 
1. Based on a paper presented at Libraries in the Digital Age (LIDA) 2004 (Bawden, 2004). 
Guest editorial


Information and e-learning 

E-learning is a subject that has fast become a “hot topic” for politicians and others 
involved within the education sector. Rapid advances in technology and increasingly 
cheaper hardware have meant that the potentials of e-learning appear to be more easily 
realised. Both the Department for Education and Skills (DfES) and the Higher
Education Funding Council for England (HEFCE) released consultative e-learning
strategies in 2003, illustrating that they regarded e-learning not just as a mode
of delivery but as a key factor shaping the development and future of education for the
next five to ten years. Universities too have been attempting to embrace the demand for
new modes of learning facilitated via online technologies. As the Universities and 
Colleges Information Systems Association (UCISA) study of 2003 (Browne and Jenkins, 
2003) noted, now most universities have a campus-wide online learning environment 
and many have e-learning strategies. However, it is hard to find agreement as to what 
the actual nature of e-learning is. There is no common definition and even at times no 
common understanding of e-learning — does it encompass video and audio 
technologies; is it more than merely another way of delivering lectures to students; 
will it result in job losses and derelict campuses; what is the relationship between 
e-learning and content or knowledge management? Often institutional contexts and 
needs dictate the answers to these questions rather than a broader, common framework 
for the implementation and understanding of what e-learning means. 

For library and information professionals the concepts behind online learning are 
often not new. Staff working in this sector have been used to technology driving 
forward or leveraging change and are often more responsive to the benefits of new 
ways of working. This is in contrast to many academic staff who regard the notion of
technology leading pedagogic development as anathema. Often, however, rather
than the technology or pedagogy driving the educational agenda, they work 
hand-in-hand to enhance educational opportunities for all. Familiarity with online 
journals and databases gives information professionals a head start in embracing the 
challenges and potentials of utilising e-learning within the curriculum. E-learning can 
represent a unique opportunity for library and information professionals to become 
more involved with learning and teaching activities. It can foster the creation of new 
roles and cross-disciplinary teams. Library and information professionals are pivotal 
to ensuring that academics and students capitalise on the new wealth of online 
resources available to them and look for new methodologies of enhancing traditional 
teaching activities through greater access to resources. They also play a key role in 
ensuring that all users of online learning have the basic skills required of them to 
embrace the flexibility and independent learning that online technologies encourage. 

In this special edition we have provided a snapshot of a number of key issues 
relating to online learning that will be of interest to the information professional. We 
move from considering changes in the literature and research on distance learning — 
now often conflated with online learning — to addressing issues of information 
management and institutional change. We also include articles which consider specific 
case studies as illustrative of the forms of research currently undertaken into the use
and development of online learning in the curriculum. This diverse selection of articles 
illustrates the wide impact of e-learning and the myriad of areas that it encompasses. It 
helps explain why we cannot have a single, common definition of e-learning by 
representing the complexity of interrelationships that e-learning has. If we are truly 
successful in embedding e-learning into our day-to-day activities as academics, 
information professionals or students, then perhaps in five years’ time we may not even
be using the term “e-learning” at all. 

This special issue of Aslib Proceedings covers a multitude of issues related to 
e-learning. Williams and colleagues provide a review of the literature encompassing a 
number of these. It offers a short history of distance education, describing the media 
used, and reviews the literature on achievement, attitude, barriers to learning, and 
learner characteristics. The literature shows that using a variety of media, both to
deliver pedagogic material and to facilitate communication, does seem to enhance
learning. Similarly, attitudinal studies appear to show that the greater the number of
channels offered, the more positive students are about their experiences. With regard to
barriers to completing courses, demographic variables appear less predictive of 
completing an educational programme than life circumstances, attitude and the degree 
of social support received. 

In contrast to this broad overview, various case studies are presented. Ali and 
Proctor describe how a group of English language schools in Pakistan (the “City 
School” consortium) ensured that all its pupils had access to the most modern of 
information and communication technologies (ICT) courses. The article discusses the 
decision to implement its “ICT revolution”, the nature of the programme, how it was 
organised, the materials required and the outcomes of its implementation including its 
outstanding success with pupils and their parents. ICT courses now integrate, 
effectively, subject teaching in a variety of subjects including, for example, geography, 
science, and mathematics. Older pupils use standard computer application packages 
and languages to develop more specific skills and abilities. Disadvantages are also
acknowledged. These include the reduction in the school timetables of time given to 
subjects previously thought worthy of extended time. Inevitably, the children lose time 
previously given to such subjects as music, art, PE, etc. 

Kendall’s paper is focused more on one particular “e-system” and its impact. This is 
an interactive online tutorial aiming to improve student citing and referencing practice. 
An action research approach was used, involving three cycles of activity. The first 
used a checklist to identify the most frequently occurring errors made by users. The 
results showed a high number of errors, despite the instruction received by students, 
and the need to start the tutorial at an unanticipated basic level. In the second cycle, the 
students’ performance was compared before and after using the tutorial. Usage was 
monitored through WebCT tracking facilities and usability testing undertaken. Results 
informed the third cycle, involving adoption of the tutorial as the standard 
departmental practice, further support for staff and students and more use of WebCT 
for other teaching. Improvements were found all round, showing what Kendall 
describes as a “qualified success” in the use of online learning for this purpose. 

Quinsee and Sumner’s paper, also a case study, examines how introducing an 
institution-wide managed learning environment impacts on the processes of 
organisational change. City University was used as a case study, with interviews with
leading members of the institution providing an exemplar of the change process
and institutional plans for a future strategy. Research interviews with key
decision-makers offered a valuable insight into the process of institutional change
within the university. One of the most significant features of all these interviews
was the recognition that institutional change is a complex evolutionary process.
The over-arching message that came out was that the e-learning process, and the
study itself, have made senior decision makers reflect on their role within the
institution and the role of the managed learning environment (MLE) in this cycle
of change, described in the paper as “one of the most truly revolutionary,
unanticipated outcomes of the e-learning initiative”.

Akeroyd discusses the practical content management of information resources 
within the context of emerging e-learning systems, examining some approaches and 
asking questions about the validity of current thinking. Akeroyd argues that much of 
the debate over this topic has been at a technical level and is focussed on the specific
issues of ensuring interconnections between resources identified in the learning
domain and those held in learning resource repositories and elsewhere. Much learning,
he asserts, is unstructured and open, and students within HEIs will continue to
need library portals to enable access to the totality of information resources that they
require. In the same way, tutors will continue to search comprehensive library
collections in an ad hoc way. Thus content stretches from the unstructured to the
dynamic and free form, while learning can be precise, directed and dependent on the
one hand and open and content free on the other.

Eynon reports the main findings arising from discussions with academics based in 
higher education institutions (HEIs) who use information and communication 
technologies (ICTs) for teaching and learning. The main themes discussed are: how 
ICTs are being used in academia; the motivations of academics to adopt ICTs in their 
teaching; the difficulties they have encountered when doing so, and the factors that 
may influence the further adoption of new technologies in higher education. 

A clear motivating factor for academics was to use ICTs to enhance the educational 
experience in some way and to overcome some of the difficulties associated with 
teaching ever-greater numbers of students. Academics highlighted the need for greater 
collaboration with software developers in order that future programmes and 
technologies would be developed that could accommodate the varied demands of 
educators. They felt they should have a greater role in shaping institutional strategies 
in this area; and a prescriptive “top down” strategy was thought to have a potentially 
damaging effect on the future adoption of ICTs for teaching and learning. 

Importantly, Eynon also explores non-use of electronic resources. She concluded 
that non-use was not simply down to practical issues, but that there may be other good 
reasons, such as a lack of student demand, and inappropriateness of subject matter. 
What is apparent is how important local context is in the use (or non use) of ICTs for 
teaching and learning and that academics need to be part of the process when 
developing future policy and technological developments in e-learning.

Finally, Andretta examines the competences associated with e-learning under the 
umbrella of information literacy, on the premise that e-learning needs to be
underpinned by information literacy skills to foster independent learning, predispose 
the students towards a lifelong-learning attitude, and equip them to deal effectively 
with information overload. Implications of implementing an information literacy policy 
are explored from the point of view of the information literacy tutor or educator
working within the higher education environment. The UK perspective on e-learning is 
also presented and compared to other national approaches. Andretta asserts that the 
British Government aims to foster the type of lifelong learning skills associated with 
the information literacy approach. However, a comparison between the UK e-learning 
model and the information literacy education initiatives in other English-speaking 
countries illustrates a need for the UK to implement a more cohesive information 
literacy policy. 
CIBER, School of Library, Archive and Information Studies, University College London
(peter.williams@ucl.ac.uk)

Susannah Quinsee 

E-Learning Unit, City University, London (s.quinsee@city.ac.uk) 


Reference 
Browne, T. and Jenkins, M. (2003), “VLE surveys: a longitudinal perspective between March 2001
and March 2003 for higher education in the United Kingdom”, UCISA, available at:
www.ucisa.ac.uk/group/tlig/vle/vle2003.pdf (accessed 25 January 2004).











The Emerald Research Register for this journal is available at www.emeraldinsight.com/researchregister
The current issue and full text archive of this journal is available at www.emeraldinsight.com/0001-253X.htm











E-learning: what the literature
tells us about distance education


An overview 


Pete Williams and David Nicholas
CIBER, School of Library, Archive and Information Studies, University College
London, London, UK, and

Barrie Gunter
Department of Journalism Studies, University of Sheffield, Sheffield, UK

Received
Revised 12 July 2004
Accepted 28 August 2004


Abstract 

Purpose — The CIBER group at University College London are currently evaluating a distance 
education initiative funded by the Department of Health, providing in-service training to NHS staff via 
DiTV and satellite to PC systems. This paper aims to provide the context for the project by outlining a
short history of distance education, describing the media used in providing remote education, and
reviewing research literature on achievement, attitude, barriers to learning and learner characteristics.
Design/methodology/approach — Literature review, with particular, although not exclusive, 
emphasis on health. 


Findings — The literature shows little difference in achievement between distance and traditional 
learners, although using a variety of media, both to deliver pedagogic material and to facilitate 
communication, does seem to enhance learning. Similarly, attitudinal studies appear to show that the
greater the number of channels offered, the more positive students are about their experiences. With
regard to barriers to completing courses, the main problems appear to be family or work obligations.
Research limitations/implications — The research work this review seeks to consider is 
examining “on-demand” showing of filmed lectures via a DiTV system. The literature on DiTV 
applications research, however, is dominated by studies of simultaneous viewing by on-site and 
remote students, rather than “on-demand”. 

Practical implications — Current research being carried out by the authors should enhance the
findings drawn from the literature, by exploring the impact of “on-demand” video material, delivered
by DiTV, something no previous research appears to have examined.

Originality/value — Discusses different electronic systems and their exploitation for distance 
education, and cross-references these with several aspects evaluated in the literature: achievement,
attitude, barriers to take-up or success, to provide a holistic picture hitherto missing from the 
literature. 


Keywords Distance learning, Attitudes, Communication technologies 
Paper type Literature review 





Aslib Proceedings: New Information Perspectives
© Emerald Group Publishing Limited
DOI 10.1108/00012530510589083

Introduction
The Department of Health (DoH) has commissioned University College London’s Digital Health
Research Unit (part of the CIBER group) and the University of Sheffield to evaluate a
pilot digital interactive television (DiTV) learning initiative “NHS Learn”, which is
about to be broadcast by Aston Media, to learners at various learning centres across
the country. NHS Learn is the fifth DiTV pilot commissioned by the DoH and evaluated
by the UCL-Sheffield team (see, e.g. Nicholas et al., 2002a, b; Huntington et al., 2002;
Williams et al., 2003; Gunter et al., 2001). The learners are either current NHS staff,
undertaking continuing professional development, or prospective staff. Learning 
centres are a mix of hospital trust locations and academic institutions. In addition, a 
number of training/learning organisations open to the general public have expressed 
an interest in offering opportunities to individual learners. These include a number of 
libraries and organisations working under the banner of “The learning exchange”, who 
offer public access to learning from various locations in the Birmingham area. 

The content of the educational package consists principally of the video recording of 
health and medical lectures and seminars for viewing either via PCs with a satellite 
link decoder, or DiTV. Many recordings are supplemented by course materials 
available electronically at the study centres. Aston are funded to produce the filmed 
material and do not take any payment either from the course providers, for having 
filmed the courses, or from the learning centres where the courses are run. The latter, 
which include, but are not exclusively, the course providers, are all linked to Primary 
Care Trusts (PCTs), Teaching PCTs and one Acute Trust, and will, if the project 
becomes a regular service, be the main clients for NHS Learn. Some of the courses are 
accessible from students’ workplaces and others at academic institutions which 
already run nursing/medical courses. 

Transmission times are agreed with the course purchasers. One important aspect of 
the service is that the broadcasting times are not the times at which students view the 
programmes. They are stored locally and available for a period of approximately two 
weeks to allow for on-demand access. Learning centres can decide their own schedules, 
either allowing individual viewing or time-tabling in group sessions at specific times. 
The courses are of varied length, from one hour (e.g. “Introduction to risk assessment”) 
to 24 hours (e.g. “Healthcare, ethics and law”) and cover a wide range of topics and 
levels. Many of the courses are suitable, for example, for ancillary workers. Course
assessment ranges from confirmation that a programme has been viewed and 
materials read, to a formal test or practical examination. At present, one course will go 
towards a Masters degree, and others form part of the NVQ award. 

The project presents an opportunity for the UCL/Sheffield team to evaluate the
efficacy of one form of distance learning as an important development in the drive to 
provide training for NHS staff. In this regard, it will have important implications for 
future strategy and planning within the NHS for staff development and training. The 
project aims to provide comprehensive evaluative feedback from students attending 
the NHS Learn courses at all the sites to which they are being delivered. Continuing in 
the same vein as the earlier evaluation of the first four pilot DiTV health services, this 
project is exploring the potential benefits and costs of digital television as a delivery 
platform for training NHS personnel. The research will also make recommendations 
about how remote provision of training utilising television/video and Web support can 
most fruitfully evolve across the NHS. There may also be implications and lessons for 
such training applications for other government departments and agencies that have
pressing training needs. 

This paper attempts to provide the context for the project in terms of a short history 
of distance education, followed by an account of the media used in providing remote 
education currently, and research that has been undertaken, on achievement, attitude, 
barriers and learner characteristics. 








Theories of distance education 

Before discussing the history of “distance education”, or current research, it is, of
course, necessary to define exactly what is meant by the term. McIsaac and
Gunawardena (1996) summarise the characteristics of distance education from their
own review of the literature as: education imparted where the learner is physically
separated from the teacher (Rumble, 1986); a planned and guided learning
experience (Holmberg, 1986, 1989); and a two-way structure distinct from
traditional classroom instruction (Keegan, 1988). Many writers have looked at the
higher level of independence or “learner control” (Holmberg, 1995) which is a feature of
distance education. Baynton (1992) developed a model to examine this concept in terms
of independence, competence and support. She notes that “control” is more than
“independence”: it is also affected by competence (ability and skill) and support
(both human and material).

Another concept, that of “transactional distance”, was advanced by Michael Moore 
(1990). Here, “distance” is determined by the amount of communication or interaction 
which occurs between learner and instructor, and the amount of structure which exists 
in the design of the course. Greater transactional distance occurs when a course has 
more structure and less communication (or interaction). A continuum of transactions 
might exist in this model, from less distant, where there is greater interaction and less 
structure, to more distant where there may be less interaction and more structure. 
There is, these days, the problem of conflating distance learning with e-learning. It
could be argued that e-learning provides such a high level of interaction that the 
“distance” is necessarily smaller. 


History of distance education 

Distance education programmes date from the nineteenth century (Nasseh, 1999; 
McIsaac and Gunawardena, 1996), although it has been suggested that even St Paul
spreading the Gospel, with his letters to early church groups, was a form of distance 
education (Demiray and Isman, 1999). This was on the basis that his “students” were 
remote and widely distributed and that his letters were a form of education. The first 
type of formal distance education course, in the nineteenth century, was also, of course, 
in the form of the written word. Isaac Pitman, regarded as the first modern distance
educator, began teaching shorthand by correspondence from the English city of Bath
in 1840. The University of London founded its correspondence college at around this 
time, and other private correspondence colleges began in the late 1880s (Levenburg, 
n.d.). In the USA, correspondence courses had also taken off (Watkins and Wright, 
1991), and by 1910 International Correspondence Schools in the USA already had 
around 184,000 students (Glatter and Wedell, 1971). 

Newer technologies have been used since the start of the twentieth century. 
Instruction films appeared in 1910 (Reiser, 1987) and the State University of Iowa 
began experimenting with transmitting instructional courses as early as 1932, seven 
years before television was introduced at the New York World’s Fair (Jeffries, n.d.). By 
1939 the university had broadcast almost 400 programmes (Moore and Kearsley, 1996). 
Wisconsin’s “School of the air”, another example, was broadcasting ten programmes 
per week to campuses in the 1930s, and continued on-air until the 1970s (Bianchi, 2002). 
Meanwhile, radio was also being exploited. In the mid-1920s, the Department of 
Education in the UK began to provide schools with radio-based instruction, and soon
10,000 schools were using BBC radio programmes to support classroom teachers
(Demiray and Isman, 1999). 

Television and, especially, radio were used to a greater degree after the war, though 
not, according to Cambre (1991), with too much success, owing to the unimaginative
way in which lectures were filmed and presented. The University of Wisconsin,
however, again at the forefront of progress, created the Articulated Instructional Media
project (AIM), which attempted to be a complete system of distance education, 
including broadcast media, correspondence and telephone (Cook, 2000). In the UK, the 
Labour Government also looked to television to provide distance learning, and 
approved the setting up of the so-called “University of the air”, renamed the Open
University (OU), based in Milton Keynes. It has become the UK’s largest university, 
with over 200,000 students (OU, 2003a). The Open University model has been adopted 
by many countries in both the developed and developing world (Keegan, 1986). 

In the mid-1970s, satellites began to be used for television broadcasting and the idea 
of teleconferences began to emerge (Moore and Kearsley, 1996). Audio and video 
recordings, teleconferencing and interactive telecommunication increased rapidly 
throughout the 1980s (Moore, 1990). Personal computers enabled what have become
known as “multimedia” applications to be developed and widely used. CD-ROMs
enable large amounts of audio, images and moving pictures to be distributed to 
students at a reasonable price (Moore and Kearsley, 1996) and, latterly, the internet has 
become a central medium to facilitate remote learning. It has been embraced by major 
distance education providers such as the OU, which claimed in 2003 that more than
180,000 students interacted with the OU online from home, and that “every week more 
than 30,000 students view their academic records online” (OU, 2003b). This apparently 
peaked at 65,000 users in the week that exam results were available. 


Current media used in distance education 
The previous section showed how distance education has grown from an activity 
involving written communication only, to utilising TV and radio technology, to today’s 
vast array of platforms, formats and delivery mechanisms. These can be summarised 
as described below, starting with digital interactive television, as this is the principal 
medium being investigated in the present study. 


Digital interactive television (DiTV) 

Broadcast television continues to be an important and widely used medium for the 
delivery of distance education, and has been the subject of much research, as described 
later in this paper. However, digital TV is fast becoming a mass medium in the UK.
Satellite/digital/cable TV ownership has also increased dramatically over the past few
years. A clear majority of UK homes now receive digital services (Higham, 2003). Some
28.2 million people live in multi-channel homes against 27.2 million who still receive analogue
services. The advantage of digital TV over the traditional analogue system is the better
image quality and enhanced capacity (Niiranen et al., 2002, p. 250), so that TV
companies can provide additional personalised interactive services such as web access,
banking and e-mail (Love and Banks, 2001). It is often overlooked in discussions on
e-learning that learning via TV may be equally viable and significant, and encourages
widening participation in a more effective way, perhaps, than e-learning.








The present authors have been much involved in the evaluation of pilot DiTV
services for the general public, or “health consumers” (see, e.g. Nicholas et al., 2003;
Huntington et al., 2003; Williams et al., 2003; Gunter et al., 2001). Results suggested that
the medium was well received by users with few usability problems being reported. 
Some services were not substantially used, but this may have been because of the small 
availability period. Also, tentative suggestions were made that different types of 
information might be more effectively disseminated by different media or platforms. 
For example, patient experiences were found to be ideally transmitted using video, but 
information concerning medicine dosages and effects may be presented more clearly as 
text, possibly in tabular form. 


Video-conferencing 

Video-conferencing is generally two-way and carries audio and video information, so 
that people at two or more sites can see and hear each other. Many medical studies 
have been undertaken using video-conferencing facilities. Brunk (2002), for example, 
describes an initiative to provide nutrition counselling for elderly people in Nevada. 
Similarly, Swindell and Mayhew (1996) provided 18 housebound elderly people with an 
eight-week tele-conference offering practical information in nutrition, health and social 
services. Education for health professionals has also been offered via this medium. 
Andrusyszyn et al. (2000) used a video-conferencing facility and asynchronous
computer conferencing to enhance learning and promote international collaboration 
among graduate nursing students. 


Audio-conferencing 

Audio-teleconferencing may be described as two-way voice communication using 
standard telephone-type technology (Kirby and Boak, 1987). While not as sophisticated
as video-conferencing, audio-conferencing also facilitates interaction. Research into the 
use of audio-conferencing is rare. In one of the few studies to date, Cragg (1991) 
examined the experience, learning strategies, and reported learning of nurses taking a 
course either by audio-teleconference or correspondence. She found that the 
teleconferences encouraged group learning, although correspondence was more
convenient.


World wide web/internet 

The world wide web is becoming ever more exploited in education. According to Olson 
and Wisher (2002), Web-based education offers learners “unparalleled access to 
instructional resources, far surpassing the reach of the traditional classroom”. It also 
makes interaction possible to a much greater extent than traditional distance education 
(Newman and Scurry, 2001). The use of the web in learning is not problem-free, 
however. Pajo (2001) identified a number of barriers to the uptake of web technology by
university staff. Chief among these were the time required in learning how to use 
web-based technology and develop appropriate courses, the lack of training, and 
monitoring web-based teaching. 


Video/audio tapes 
Audio cassettes are convenient because of their portability and because they can be 
used privately on headphones. This medium is used to a large extent in language 
training, where sound is of particular importance. One of the few studies of audio
pedagogy was that by Beare (1989), who compared the effectiveness of six
instructional formats which allowed differing levels of interaction, including
audio-assisted independent study. Results showed that neither individual instructional
formats nor the degree of interaction had much effect on student achievement. Distant
learners, including those in the audio group, found the course just as stimulating, were
equally interested in the subject matter, and judged the instructor or narrator to be
as skilled as did those receiving face-to-face instruction.

Video instruction became
popular during the 1980s when the price of video-recorders fell and they became a
common feature in the home. Surprisingly, no work appears to have been undertaken
on the use of video as an “on-demand” medium (i.e. on the benefits of
instant replaying of material etc.) despite the fact that much learning, in particular
foreign language learning, takes place using this medium. More typically, Paulsen et al.
(1998) compared student achievement and satisfaction with regard to course delivery 
via DiTV, broadcast TV and videotape, but without examining how the media were 
manipulated. Student achievement was not significantly different in any of the groups. 


Telephone/Fax 

The telephone is, of course, generally used for one-to-one contact, and forms only a 
minor part in distance education. Hobbs et al. (2000), cited in Finger and Rotolo (2001),
examined the replacement of a radio service with telephone for on-air lessons at a 
distance education school in Queensland, Australia. The researchers found many
benefits of the telephone over radio, including greater understanding of learning tasks, 
increased motivation, more participation, improved enjoyment, and a greater range of 
teaching strategies being utilised. 


CD-ROM 

CD-ROMs allow multimedia to be captured on to a laser disc and used with personal 
computers. Little research appears, surprisingly, to have looked at CD-ROM mediated 
learning. In one of the rare studies to have looked at this medium, Oviatt et al. (2000)
found that the use of a CD-ROM with students undertaking a course in trans-national 
management was not associated with better examination performance. The rise of the 
Internet has made the use of CD-ROM somewhat dated. 


Overview of distance education research 

Distance education research, as the studies already cited in this paper testify,
encompasses a huge range of issues. Those that are of particular relevance to the
current project are:


* achievement/outcomes; 

* attitudes/opinions (e.g. level of satisfaction etc.); and 

* accessibility/barriers (both to course participation and completion and delivery
type).


These are discussed in turn below. 








Achievement/outcomes 
As may be expected, a huge amount of research has gone into various aspects of 
distance education in terms of student achievement and outcomes. A useful starting 
point for a brief review is to look at results from meta-analyses. Such studies indicate
little difference in achievement between traditional face-to-face and distance learning. 
Indeed, this finding was being reported as far back as the early 1960s, with particular 
regard to the use of television (e.g. Schramm, 1962). Dubin and Hedley’s (1969) review 
of studies also found no significant difference between television and face-to-face 
instruction. Later meta-studies also tended to conclude that there was little difference 
(e.g. Cohen et al., 1981; Moore et al., 1990), although the latter cautioned that much of
the published literature was either anecdotal or employed weak research designs. 
However, even the latest meta-studies (e.g. Machtmes and Asher, 2000) continue to 
show little difference in achievement between distance and traditional learners. 
Navarro and Shoemaker (2000) contend that much distance education literature is 
based on “older” learning technologies, such as television. By contrast, they say, there 
are few studies that “rigorously compare distance learning in the newer, multimedia 
cyber-learning format with traditional learning” (Navarro and Shoemaker, 2000, p. 17). 
They attempted to rectify this with a study that looked at both performance and 
perception of traditional versus “cyberlearners”. The latter were provided with lectures 
on CD-ROM, together with online quizzes, an electronic bulletin board (asynchronous 
communication), a “discussion room” (synchronous chat), and e-mail access to the 
course tutor. The traditional group were provided with face-to-face lectures, 
discussions and a standard textbook. Performance of the two groups was rated by 
comparison of final examination scores, and attitudinal measures (described later). 
Results showed that the cyberlearners performed significantly better than the
traditional group, whether compared by gender, ethnicity or class level.


Attitudes/opinions (e.g. level of satisfaction) 

Again, a good starting point here is to look at meta-analyses. Allen et al. (2002) looked
at studies comparing student satisfaction with distance education to that with traditional
classrooms in higher education. Results indicate that: “students indicate a slightly
higher level of satisfaction with live course setting than distance education formats”
(Allen et al., 2002, p. 89). The effects of communication channel were examined, which
showed a preference for video over written formats. The authors point out that this is
consistent with the hypothesis that greater information, including the ability to see the
instructor, is preferred over more limited channels. Interaction was also examined. Not
surprisingly, “full interactive audio/visual demonstrated the largest effect” (Allen et al.,
2002, p. 91). In sum, the authors conclude that students compare distance education
favourably to other educational formats. 

Navarro and Shoemaker’s (2000) “cyber-learners” versus face-to-face student study 
has been outlined above, with regard to student performance. The authors also looked 
at student attitudes with regard to the course presentation. For this, an attitudinal 
survey was undertaken. One part of this concerning workload, reasons for taking the 
course, quality etc., was given to all students. A second part, however, was given to the 
cyber-learning group only. This focused on the evaluation of the technologies involved 
(CD-ROM, online bulletin board, etc.). Results showed that a desire to learn at one’s own
pace (28 per cent), and to not have to attend lectures (20 per cent), were significant
factors in choosing the cyber-learning mode. Of those who chose the traditional course
delivery mode, 49 per cent indicated
that they felt more comfortable in the familiar environment, 20 per cent felt they would
not learn as much online, and 15 per cent were not aware that they had had the choice
of options.

Much distance education research into perceptions, attitudes, etc. does not compare 
distance to face-to-face courses, or look at the interplay between online and offline 
environments, but instead examines the distance environment in itself. One such study 
is that undertaken by Daugherty and Funke (1998). The researchers surveyed staff and 
students involved in a web-based Masters degree course in education, which involved 
both using search engines to find information, and accessing a number of given 
health-related web sites. The course was well received, with the most cited benefit 
being the vast information store housed on the web. Interestingly, however, it was the
technology-related knowledge, rather than subject-related, that was rated most highly
(i.e. learning to navigate the web, using listservs etc.). Apart from some problems, such
as perceived lack of staff support, and some student resistance, overall the course was 
considered a great success. 

These positive findings were not mirrored by a study carried out by one of the 
present authors (Williams, 2001, 2002). Undergraduate psychology students were 
required to use WebCT for course notes, learning exercises and online discussions.
In-depth interviews with both students and lecturers showed that the two groups
differed markedly in their perception and evaluation of the system. Many of the
advantages trumpeted by lecturers were dismissed by learners, who felt that online
material gave them extra work, represented an abrogation of academics’ teaching
duties (i.e. by simply posting reading materials online without explaining them) and
shifted printing costs from the institution to the student. The study concluded that
more attention needs to be paid to user needs, from users’ own perspectives, and to user
attitudes towards information provision, and that material should then be tailored to
take these factors into account.

Unlike the studies outlined above, Thurmond et al. (2002) attempted to evaluate
student satisfaction with a web-based distance education course whilst controlling for
student characteristics. The authors argue that although previous work has examined
student satisfaction with web-based distance learning (e.g. Billings et al., 2001) it is
difficult to link student perceptions with purely environmental variables. In other
words, it may be that, as the authors put it, “students ... were more satisfied with
web-based courses because of their computer skills or high level of knowledge
regarding course content rather than as a result of ... the web-based course” (Billings
et al., 2001, p. 86). In fact, results indicated that student characteristics did not influence
reported levels of satisfaction. 


Accessibility/barriers 

A very important aspect of distance education is, of course, that of accessibility. Much 
of the work looking at this issue has approached it by studying the characteristics of 
students who either fail to enrol for or complete a particular course. As Powell et al. 
(1990) say: 


Questions related to why some students succeed and others fail ... are of both theoretical and 
practical importance, as distance education moves from a marginal to an integral role in 
overall educational provision (Powell et al., 1990, p. 5). 











Typical of papers on this subject is that by Siqueira de Freitas and Lynch (1986), who
investigated drop-out rates on a remote university introductory course. Unsuccessful
students tended to be older, to be less likely to use the resources available, to devote less
time to the course, to work or have other distractions from study, and to find the materials
more difficult to use.

Powell et al. (1990) went further than this kind of analysis, developing a conceptual
framework of student success and persistence in distance education which
concentrates on the effect of predisposing characteristics on student success. Powell
et al. classified the factors contributing to success and retention in distance education
into three general categories. These are:


(1) Predisposing characteristics: including prior education, socio-economic and 
demographic status, and motivational and other personal attributes. 


(2) Life changes: such as personal illness, relocation, altered employment status, 
and family problems. 


(3) Institutional: including quality and difficulty of instructional materials, access 
to and quality of tutorial support, and the administrative and other support 
service provided. 


In their study of drop-outs from a Hellenic Open University course in education studies, 
Vergidis and Panagiotakopoulos (2002) found that the main problems stemmed from 
family or work obligations, rather than from factors intrinsic to the course or its 
delivery. It has long been known that such external factors were extremely important. 
Knox’s (1977) developmental-stage orientation of adult life stresses the importance of 
understanding the context within which a person carries out their everyday activities, 
i.e. their family, work, health, condition, personality etc. These all affect adults’ ability 
and willingness to participate in adult education. No single factor appears to cause 
non-participation; however, individual student characteristics and life circumstances 
appear to have the greatest impact on participation (Kerka, 1986). Other studies (Carr 
et al., 1996; Goodman et al., 1990; Lazin and Neumann, 1991) all indicate that 
demographic variables were less predictive of completing an educational programme 
than attitude and the degree of social support received. 

Some work has been carried out with regard to barriers that prevent full 
participation in online courses, even for students who complete them. Howard (2002), 
for example, identified several barriers with regard to online interaction, the principal 
one of which was an “insurmountable social-psychological barrier”. Technical 
problems were also blamed for a lack of interaction, with the sound often of poor 
quality and difficulties in manipulating cameras and microphones. Howard also noted 
a certain degree of alienation, brought about by the lack of physical presence and the 
reluctance to use the technology. The latter finding reflects earlier research by, for 
example, Comeaux (1995) and McHenry and Bozik (1995), which indicated low levels of 
interactivity often resulting from technological problems. 


Conclusion 
This paper has outlined the history of distance education and the way in which 
information technology has been used to support the practice. Research literature has 
offered a number of insights into different aspects of the topic. Most research shows 
little difference in achievement between distance and traditional learners (e.g. 
Machtmes and Asher, 2000), although using a variety of media, both to deliver 
pedagogic material and to allow effective communication between learners and tutors, 
does seem to enhance learning to the extent that distance learners can out-perform 
face-to-face colleagues (e.g. Navarro and Shoemaker, 2000). Similarly, attitudinal 
studies appear to show that the greater the number of channels offered, the more positive 
students are about their experiences (e.g. Allen et al., 2002). 

With regard to barriers to completing courses, the main problems appear to stem from 
family or work obligations, rather than from factors intrinsic to the course or its 
delivery (Vergidis and Panagiotakopoulos, 2002). Many studies (Carr et al., 1996; 
Kerka, 1986; Lazin and Neumann, 1991) indicate that demographic variables are less 
predictive of completing an educational programme than life circumstances, attitude 
and the degree of social support received. Barriers have also been identified to the full 
participation of students, such as technical problems (Williams, 2002; Comeaux, 1995) 
or ability and social factors inhibiting interaction (Howard, 2002). 

The research being carried out by UCL/Sheffield universities should enhance these 
findings, by exploring the impact of “on-demand” video material, delivered by DiTV, 
something no previous research has, apparently, examined. It might be particularly 
needed in the current climate where distance education is often subsumed into 
e-learning and this is seen as the panacea for everything, whereas actually these other 
mechanisms may be as viable. 

Abstract 

Purpose – In an age of information technology some developing countries are more vulnerable than 
others to international competition through failure to utilize fully the benefits of an ICT culture. The 
authors suggest that the strategies in response must include a radical review of attitudes and methods 
of delivery of ICT in schools and give as an example the recent experience of The City School, the 
nationwide schools’ organisation in Pakistan with whom they are employed. 
Design/methodology/approach — The authors review the current position of ICT in schools in 
Pakistan and suggest as a model of development, for schools of a corresponding standing, that of The 
City School. They describe how The City School responded to ICT, ensuring that all its pupils would 
have access to the most modern of ICT courses. The paper discusses how the decision to implement a complete 
change, or revolution, in teaching ICT was brought about in a relatively short time. It also discusses the 
nature of the programme, how it was organised, the materials required and the outcomes of its 
implementation including its outstanding success with pupils and their parents. 

Findings — The authors chronicle the historical developments within The City School that brought 
about radical change within a comparatively short period and identify careful planning, training, and 
the motivation of stakeholders, i.e. pupils, teachers and parents, as key elements in its successful 
implementation. 

Originality/value — The authors suggest that The City School experience provides a model that 
may be emulated by schools elsewhere in both developing and industrially developed countries. 
Keywords Communication technologies, Pakistan, Design and development, Schools 


Paper type Case study 


Introduction 

The Economist Intelligence Unit, in May 2004, in its report on the comparative 
prevalence of e-commerce worldwide, listed the UK as second to Denmark while, 
perhaps inevitably, Pakistan came very close to the bottom of the 64-country league 
table (The Economist, 2004). Initiatives in Pakistan are taking place: the government is 
introducing e-government at a significant pace, for example (Sun Microsystems, 2004). 
In the country’s advanced industrial sector, despite initial hesitation due to economic 
constraints, corporate industry is modernising in this area (Mansoor, 2004). The future 
expansion and use of information and communications technology (ICT) in Pakistan, 
as elsewhere, must inevitably, however, depend on the school and higher education 
[...] country be confident of making the best use of the ICT opportunities now available 
within Pakistan. In the government schools, ICT courses remain limited to the 
relatively small secondary sector. In the private schools, to which it is estimated that 27 
per cent of the school-going population belong, an effort is being made to incorporate 
appropriate courses: “Computer studies” features on the timetables of most private schools 
for the middle-income groups and above. These schools usually offer pupils, in the 
older classes, the international GCE O-level Computer Studies course as an optional 
subject. In June 2002 there were 518 entries for this increasingly popular Cambridge 
O-level exam; the following year entries from schools throughout Pakistan had 
increased to 1,467 (University of Cambridge Local Examinations Syndicate, 2003). This 
article describes how The City School, a leading private schools’ organisation in 
Pakistan, responded to the challenge of introducing ICT from the late 1980s until the 
present time. 

Proceedings: New Information [...], pp. 123-130. © Emerald Group Publishing Limited 


The City School 

The City School is the generic title for one of the largest private school systems in 
Pakistan. It has an actively involved owner, Dr Farzana Firoz, a businesswoman and 
educationalist at its head as managing director. There are, currently, 129 schools 
situated in 35 cities widely distributed throughout the country. It requires nearly three 
hours flying time in this longitudinal, largely mountainous and desert Indus 
river-basin country, plus a similar number of hours’ driving time, to travel from the 
southernmost schools to the most northerly. From west to east, distances are measured 
in driving time (there are few helpful air routes), the furthest schools requiring a full 
day’s travel from their Regional Office in Karachi, Larkana, Multan, Lahore or 
Islamabad. 

There are some 35,000 pupils in schools of varying sizes, from nearly 
3,000 pupils in the case of one modern campus to perhaps fewer than 100 in the case of 
newly opened schools housed, for the time being, in adapted residential buildings. 
Thus the introduction of an ICT-culture in The City School, as a whole, presented 
unusual challenges in terms of the distances and logistics of the schools’ organisation; 
for example, merely getting the ICT teachers together for meetings is an expensive 
undertaking, although meetings are now arranged at least twice during the year. 
However, throughout the network of schools there is a high sense of motivation and a 
strong sense of unity of purpose. The name, The City School, in the singular, was 
chosen to suggest one school: children entering a City School in any part of Pakistan 
enter a family of schools, each of which has the same values and objectives; hence 
throughout all schools, there is that important feeling of belonging to one school. The 
school system, indeed, contributes to the building of national unity. 


The City School curriculum 

Curriculum committees centrally plan The City School syllabi. The committees, made 
up of the most experienced teachers in Karachi, Lahore and Islamabad, and with 
formal training in syllabus development, are formed for each subject. The development 
of the syllabi depends to a large extent on the feedback, organised systematically, from 
the teachers in the classrooms throughout the country. The curriculum of The City 
School generally corresponds with that of the UK curriculum, although, of course, it 
includes Islamic studies and Urdu as significant parts of the curriculum. The City 
School has always, too, placed great importance on the teaching and learning of science 
in its primary schools even before, in the UK, the National Curriculum emphasised the 
importance of this area for young children and increased, in many UK primary schools, 
the time spent on this subject. 

In the humanities, teaching and learning, where possible, reflects the history and 
culture of the subcontinent. Pakistan Studies is introduced as a valued subject, 
ultimately taken at O-level, for pupils aged 13 years and above. More than 2,000 of the 
School’s pupils appear every year in the GCE Ordinary and Advanced Level 
examinations provided by Cambridge International Examinations, the overseas’ school 
examinations department of Cambridge University. The O-level, in Pakistan, is 
preferred to the alternative International General Certificate of Education (IGCSE) on 
the grounds of its lower cost and, for a developing education system, its means of 
assessment by formal examination only. 


Introducing ICT in The City School 

The fast moving developments in ICT and its rapid introduction into the world of 
industry and business made an enormous demand on schools. They were suddenly 
morally and professionally bound to impart sound ICT education to their pupils and to 
a level whereby ICT becomes not only a “subject”, but also an effective tool for the 
study of all other subjects. It was necessary, therefore, for The City School pupils, and 
for all pupils in Pakistan, to be introduced to the teaching and learning of computer 
skills. This was an enormous financial hurdle for the impoverished government sector 
(one that it has not yet crossed to a significant extent), but for the wealthier private 
schools catering to the middle-income and above groups, the hurdles lay more in 
developing the vision and imagination necessary to cope with a whole new dimension 
in the lives of children and the adults they will become. In The City School the vision 
was present. Implementation, once the decision was made to introduce an ICT-rich 
culture, was the next step. 

The City School’s management endeavoured to include ICT in its curriculum in its 
schools from the earliest days. Its initial policy was to introduce ICT in the classes for 
children of 11 to 16 years. From the mid-1980s, each school for this age group had its 
“Computer Room” to which each class had access for one or two periods each week. For 
the youngest children, from five to 11 years, it seemed more important to ensure a high 
standard of attainment in the basic subjects, including English, Urdu, and 
mathematics, in a timetable containing, as noted above, subjects additional to all the 
other subjects of a modern primary school curriculum. The timetable, in other words, 
was already full. Therefore, until September 2002, in deference to the perceived 
priorities at that time, The City School’s curriculum development team provided a 
programme of study in ICT only for pupils of classes VI to VII (i.e. 11 to 13 years), after 
which the pupils who chose to continue computer studies did so with Cambridge “O” 
and A-level courses[1]. 

In due course there arose dissatisfaction with the teaching methodology and an ICT 
curriculum, prior to the O-level course, which did not keep pace with the rapid changes 
in the subject. The managing director, in addition, was keen to provide an ICT 
education to students of ever-younger ages, in keeping with developing international 
practice. It was determined, therefore, that The City School’s aims are to: 

* develop ICT capabilities systematically in children from a young age; 
* provide an up-to-date ICT education by making available the latest technology; 
and 
* use ICT to enhance learning in all subjects. 


NCC Education and the Computer Pioneers programme 
To find a course that would achieve or lead towards the achievement of these aims, it 
became necessary to explore a wide range of commercially produced ICT education 
programmes provided by companies from Singapore and South Asia to Europe. 
Ultimately, after analysis and examination, the UK’s National Computer Centre (NCC) 
Education’s Computer Pioneers programme, a system of teaching the UK’s national 
curriculum ICT component, was selected as the one which best met the Schools’ needs. 
NCC Education, a leading, independent ICT qualifications’ awarding body with 
training centres around the world, provides course books, teacher training, monitoring 
of courses and the moderation of assessments for the annual certificate awards; in other 
words, a complete package for the starting school or, in this case, 129 schools[2]. 
The Computer Pioneers programme is a series of courses in ICT for children aged 
four to 16 years that enable them to become confident in the use of ICT from an early 
age. For pupils aged four to 14, there are ten courses based on attainment levels within 
the UK national curriculum. For pupils aged 14 to 16, there are courses based on 
preparing candidates for GCSE, IGCSE and O-level examinations. On the completion of 
each level, pupils are assessed by NCC Education’s international team and are awarded 
internationally recognised certificates every year. The teaching methods include 
theory, workbook-based learning, practical activities, educational software at the 
primary level, and application software. 


The launching of Computer Pioneers in The City School 
The City School negotiated and finalised an agreement with NCC Education in early 
September 2002, based on the accredited partner system and on the number of pupils 
The City School would guarantee to provide annually. The programme was launched 
immediately, that is, after the start of the first term of the school year, which begins, in 
The City School, in mid-August[3]. It was an enormous challenge as the deadline to 
begin the programme, 7 October 2002, entailed introducing it simultaneously to, 
initially, some 8,000 pupils in 58 schools, geographically widespread in all the 
provinces of Pakistan, including Jammu and Kashmir. This necessitated the following 
requirements: 
(1) The importation, at short notice, of sets of resource equipment (robots, data 
loggers, for example) and sets of original branded software from the UK. 
(2) Arrangements to be made for the teacher trainers to travel from the UK and Sri 
Lanka. 


(3) Arrangements for the rigorous training in three venues (Karachi, Lahore, and 
Islamabad) of a large number of computer teachers in the use of highly 
sophisticated resource equipment and an assortment of UK software. 





(4) The distribution of weekly work plans, soft versions of textbooks, teachers’ 
guides, activity books, files and posters. They were provided by NCC Education 
on CDs, and were to be printed, in good time and by The City School, for over 
8,000 pupils. 

(5) The hardware in each school was to be upgraded and the number of computers 
in the computer labs was to be increased. New computers with a high 
configuration were to be provided; black-and-white and colour printers were 
also necessary. A standing operation procedure was laid down establishing the 
parameters for the equipment and facilities to be provided in each lab. 


(6) The continuous assessment system, in conjunction with half-yearly 
internationally moderated examinations, was to be instituted. For continuous 
assessments, teachers were trained to conduct the activities required and to 
enter marks regularly on the record sheets. For the examinations, the 
examination papers were to be brought to Pakistan by an NCC Education 
representative, and the same examination paper was to be administered to 
pupils, throughout the country, on the same day. After the examination, 
teachers were to be grouped together at three venues (Lahore, Karachi and 
Islamabad) to mark the papers in the presence of the NCC Education’s 
moderation team. 


(7) A system of transmitting the registration of pupils to NCC Education was to be 
established. On the basis of this, each school was allocated a Centre ID and each 
student an NCC registration number. On the basis of the registrations, NCC 
Education was to be paid a candidate fee in UK sterling and each successful 
pupil awarded an international level certificate. 


(8) In each school where the programme was to be implemented an orientation 
meeting, using multimedia, was to be arranged for the parents of all pupils. The 
City School encourages parents to participate in the important 
parent-teacher-pupil relationship triangle. In this case it involved introducing 
to them, in addition to their children, a new and sophisticated ICT course of 
studies. The meetings went very well and the parents were highly appreciative 
of the decision to launch the programme. 


On the organisational front, an organisational structure was put in place, the outline of 
which is given in Figure 1. 

All these activities went smoothly, for the reason that thorough planning and 
careful co-ordination were the keystones of the operational strategy. The primary mode 
of control and communication was by e-mail. Each school, regional office, the Head 
Office and the NCC Education Consultant were kept effectively in touch by this 
medium. All the instructions and soft versions of operational documents were 
exchanged through e-mail. Paper correspondence was minimal, not more than 1 per 
cent of the total. The courier mail service was used very sparingly and only for items 
within Pakistan. The procedures at the school level were painstakingly monitored and 
guided through e-mail. The NCC Education’s resident consultant provided very useful 
support, since every teacher, school head or regional co-ordinator was able to consult 
her on academic matters through e-mail. 

Figure 1. The City School: organisational chart 


Developing the programme in subsequent years 

The first year went smoothly. Before the summer vacation in June 2003 planning and 
preparations were completed for the programme to be expanded to 12,000 pupils by the 
forthcoming August. The preparations involved the establishment of new computer 
labs, the training and retraining of teachers and the importation of new sets of software 
and resource equipment. In the same year technological innovations were introduced 
which involved the automation of the registration process and students’ assessment. 
Examination records were put online. The system developed by NCC Education, 
Computer Pioneers Online (CP Online), did face teething problems at the design and 
development stage, but these were resolved through a trial run in Pakistan, which 
provided effective feedback. 


Outcomes 

It was intended that by August of the third year (2004) a minimum of 15,000 students 
would have taken up Computer Pioneers. However, the success of the programme was 
such that more than 17,000 students have enthusiastically taken it up. In 2005, it is 
expected that over 20,000 City School pupils will be taking these courses. A straw poll 
taken randomly in some of the schools in which the programme is operating produced, 
for this author, the same result: the courses are indeed popular and pupils do look 
forward to their next computer lesson. 

A further and most important extension of The City School policy with regard to 
ICT is that all heads and teachers of The City School undergo training in the basic 
skills of ICT. This is a countrywide programme organised by the Professional 
Development Department of The City School, currently based in its Head Office in 
Lahore. This year, 2005, the heads of schools and teachers will be offered an advanced 
course. It is intended that ICT be employed in the classrooms at a far more extensive 
and sophisticated level than it is at present: teachers will assist more than previously in 
the learning process rather than be the traditional conveyors of knowledge. And the 
heads, in due course, will be enabled to advise teachers on the use of ICT as a teaching 
tool in the classrooms and to introduce into their schools ICT school and class 
management systems. Clearly, the heads, with a vital leadership role, are the agents of 
change; without their support such developments in teaching methodology and school 
and class administration cannot succeed. 

Thus far, a very positive picture of a major change in the school curriculum has 
been portrayed. However, the disadvantages must include that of the reduction in the 
school timetables of time given to subjects previously thought worthy of extended 
time. The NCC Education courses require, in our terminology, three periods of 40 
minutes each for children aged up to 11 years and four periods of the same length for 
older pupils. Inevitably in an already busy timetable, the children lose time previously 
given to such subjects as music, art, PE and “library”, for which the time allocation is 
reduced. All other subjects have a fixed time allocation that their teachers find they are 
unable to reduce. The promotional literature of NCC Education stresses that their 
courses “are flexible and can fit into school class time, or may be run as after school 
classes for one or two hours a week for a year, with the exception of the Computer Ace 
courses (leading to A-level) which are usually taught over two years”. This was not 
possible in Pakistan where schools are constrained by the climate to close in the 
summer for at least ten weeks and to close at the end of a long school day at 
approximately 1.00 p.m. for the younger children and at 2.00 p.m. for those aged 11 and 
above. To make full use of the programmes, therefore, it was necessary to fit them into 
the regular timetables. Thus, a fundamental shift in priorities has taken place; what is 
removed from the timetables has been removed very reluctantly. 

Of the positive aspects of the courses, it may be said that they maintain the interest 
of children and ensure that they are provided with a sense of involvement and 
achievement. Younger pupils use interactive software designed for children to learn a 
range of ICT skills. The courses integrate, effectively, subject teaching in a variety of 
subjects including, for example, geography, science, and mathematics. Older pupils use 
standard computer application packages and languages to develop more specific skills 
and abilities. Ultimately, all our pupils are expected to be completely confident with 
ICT terminology and programming. 

Undoubtedly, a quiet revolution is taking place and one that will ensure that pupils 
leaving The City School will do so with a sound foundation in ICT. 


Notes 
1. A certain anomaly exists in Pakistan whereby “Cambridge” schools, as they are known, take 
three or, perhaps, two-and-a-half years, to deliver the two-year O-level syllabi. The City 
School has now phased this system out and delivers the syllabi to the majority of the age 
group for whom they are designed. 

2. The NCC was created by the UK Government in 1966 and privatised in the 1970s. According 
to NCC’s promotion material, they assess more than 250,000 people worldwide annually to 
NCC Education’s standards (see: www.nccedu.com). 

3. In the government school sector the school year runs from March to March. In some City 
Schools, where the National Matriculation Examination is taken, the same school year is 
adopted. 


Tackling student referencing errors through an online tutorial 

Department of Information and Communications, Manchester Metropolitan 

Abstract 

Purpose — To evaluate the impact of an interactive online tutorial aiming to improve student citing 
and referencing practice. 

Design/methodology/approach — Action research involving three cycles of activity: identification 
of the most frequently occurring errors made by new undergraduates and postgraduates following 
instruction in citing and referencing practice given in the autumn of 2002; creation of the tutorial for use 
by the same students in spring 2003, with the quizzes contributing to a portfolio assessment for the 
undergraduates, comparison of the students’ performance before and after using the tutorial, 
monitoring through WebCT tracking facilities and usability tests with dyslexic students; adoption of 
the tutorial as the standard departmental practice, repeating the monitoring activities to compare the 
results with the previous year. 

Findings — The results of the first cycle of activity showed a high number of errors, despite the 
instruction received by students, and the need to start the tutorial at an unanticipated basic level. The 
students responded positively to the tutorial and some improvements in practice were identified, 
although the tracking facilities revealed limited use by some undergraduates. Comparison of the errors 
made in 2003-2004 with those of 2002-2003 showed improvements all round. 

Research limitations/implications — Some of the improvements may be accounted for by the 
change of practice part way through the previous academic year and other interventions. 
Originality/value — The methods used will inform others wishing to carry out and evaluate online 
learning initiatives. It shows a qualified success in the use of online learning for this purpose. 
Keywords Computer based learning, Interactive terminals, Project evaluation, Referencing 


Paper type Case study 


Introduction 

Many students experience difficulties with citing and referencing bibliographic 
sources. The increasing range of electronic sources makes the task more complex 
(Stein, 1999). In the future, as more students come from non-traditional backgrounds, 
there is likely to be a greater need to provide clear explanations and opportunities for 
practice before assessment. The aims of this project were: 

(1) To identify frequently occurring errors in the references provided by first year 
undergraduate and postgraduate students in the Department of Information 
and Communications. 

(2) To design a WebCT[1] tutorial which would: 

* be based on student needs identified through the analysis of errors; 
* make learning more fun by being interactive; and 
* be accessible for disabled students. 

(3) To evaluate the impact of student use of the tutorial during the academic years 
2002/2003 and 2003/2004. 


Funding to create the tutorial was provided by the Learning, Teaching and Support 
Network for Information and Computer Science (LTSN-ICS) during the academic year 
2002/2003. 


Methods 
Guidance in action research methods informed the project. Action research has been 
described as a: 


... spiral of cycles of planning, acting (implementing plans), observing (systematically), 
reflecting and then replanning, further implementation, observing and reflecting (Kemmis 
and McTaggart, 1992, cited by Cohen et al., 2000, p. 229). 

The first cycle of activity involved: 


* following existing practice in the teaching of citing and referencing for all 
students new to the department during the academic year 2002/2003; and 
* recording details about the nature and frequency of errors made in a sample of 
student assignments produced in the autumn term. 

The second cycle of activity involved: 

* using the results to inform the design of an online tutorial; 
* introducing it to the same groups of students in the spring term; 
* repeating the exercise to record details about the errors made in the assignments 
they subsequently produced; and 
* measuring the extent to which they had made use of the tutorial using WebCT’s 
tracking facilities. 


The third cycle of activity involved: 


* drawing from the findings to plan use of the tutorial with all students new to the 
Department during the academic year 2003/2004; and 
* repeating the monitoring activities to compare the results with the previous year. 


Cycle 1: autumn term 2002 
In 2002/2003 the students involved were: 


* A total of 116 undergraduates following a new common first year programme for 
students undertaking BA or BSc degrees in Information and Library 
Management, Information Management, Information and Communications, 
Web Content Management and Modern Languages and Internet Management. 
* A total of 58 postgraduates on the taught conversion Masters courses in 
Information and Library Management or Information Management. 


These students were given some face-to-face teaching on citing and referencing 
practice during the autumn term, and a copy of a booklet containing the department’s 
guidelines which had been used for several years. An online version of the booklet was 
also available on the department’s web-based intranet for reference at any time, and the 
importance of following the guidance was stressed by many tutors when assignments 
were set. 


Identifying the type and frequency of errors 

Checklists were created to record errors in the students’ references for each format of 
material, according to the existing guidelines. These were used to keep tallies of the 
total number of references and the number of references containing one or more errors. 
Every specific error was also counted in order to identify the type of errors being made, 
as shown in the example in Table I. 
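
The tallying described above might be sketched as follows. This is a minimal illustration only, not the procedure actually used in the project: the error labels are those listed on the check-list for references to books, while the function name `tally_errors`, the data layout and the sample sheet are hypothetical.

```python
from collections import Counter

# Error labels copied from the book check-list; everything else is illustrative.
BOOK_ERRORS = {
    "Missing place of publication",
    "Missing publisher details",
    "Missing date of publication",
    "Author's first name in full",
    "Author's name/initials precede surname",
    "Title precedes author",
    "Title not in italics",
}


def tally_errors(references):
    """Tally one check-list sheet.

    `references` holds one set of error labels per reference examined;
    an empty set means the reference contained no errors.
    """
    for errors in references:
        unknown = errors - BOOK_ERRORS
        if unknown:
            raise ValueError(f"Unrecognised error label(s): {sorted(unknown)}")
    examined = len(references)
    with_errors = sum(1 for errors in references if errors)
    by_type = Counter(label for errors in references for label in errors)
    return examined, with_errors, by_type


# Invented sample data, not taken from the study.
sample_sheet = [
    set(),
    {"Missing place of publication", "Title not in italics"},
    {"Title not in italics"},
]
examined, with_errors, by_type = tally_errors(sample_sheet)
```

Keeping the count of references with one or more errors separate from the per-type counts mirrors the check-list, where a single reference can contribute to several error rows.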

All the undergraduate students’ reference lists for three pieces of work submitted at 
the end of autumn term were examined. As the postgraduates gave many more 
references, it was beyond the time available for the project to look at the work of all. 
Instead, random samples of the work by 20 students for each of three assignments 
were examined to ensure that references to a range of different types of material would 
be considered, according to the nature of the assignment. The results (Table I) showed 
that the majority of references were to books and electronic documents and a high 
number of errors were identified, despite the instruction received by the students. 
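
The percentages in the table of results follow directly from the reported counts. As a small worked check, assuming the figures were rounded to the nearest whole number (the rounding rule is not stated in the paper, and the names `error_rate` and `results` are illustrative):

```python
def error_rate(with_errors, examined):
    """Percentage of references with one or more errors, as a whole number."""
    return round(100 * with_errors / examined)


# (references with errors, references examined), as reported in the results table.
results = {
    ("Undergraduate", "books"): (114, 184),
    ("Undergraduate", "e-documents"): (142, 167),
    ("Postgraduate", "books"): (168, 379),
    ("Postgraduate", "e-documents"): (98, 122),
}

rates = {group: error_rate(*counts) for group, counts in results.items()}
```

Under that assumption the computed rates match the reported 62 and 85 per cent for undergraduates and 44 and 80 per cent for postgraduates.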

The detailed results showed that the tutorial needed to start at a more basic level 
than originally expected, for example, with some students giving the title before the 
author’s name, and many omitting the place of publication and publisher details or 
failing to follow the guidelines to put titles in italics. 


Table I. Example of a check-list for references to books

                                                 Tally    Total
Number of references examined on this sheet
Number of references with one or more errors
Missing place of publication
Missing publisher details
Missing date of publication
Author’s first name in full
Author’s name/initials precede surname
Title precedes author
Title not in italics


Table II. References containing one or more errors

                             Undergraduate          Postgraduate
                             Number    Per cent     Number    Per cent
References to books          114/184   62           168/379   44
References to e-documents    142/167   85           98/122    80


Cycle 2: spring term 2003

Designing the tutorial

The project was informed by the results of another project which took accessibility
issues into account in the conversion of an existing online tutorial into a WebCT
version (Kendall and Booth, 2003). Improvements in terms of accessibility have been
made with each version of WebCT (2004), although some problems remain, for
example the use of frames and forms. However, in 30 case studies of WebCT courses,
Pearson and Koppi (2002, p. 17) found that: 


...the methods, structure, design and presentation of materials by the designer may pose 
difficulties in accessing the learning environment for students with disabilities. These generic 
issues are not because of any constraints imposed by WebCT itself. 


The aim was to follow their guidance to avoid potential designer errors and provide 
options for adjustments to be made to meet specific needs, for example, by using 
cascading style sheets. 

The WebCT version of the first tutorial had been introduced to the same groups of 
students who were to use the tutorial on citing and referencing, during the autumn 
term. A feedback survey showed that most students experienced little difficulty with 
navigating the tutorial and suggestions for improvements were easily implemented. 

An advisory team of eight staff, including three from the University Library, was 
established for the citing and referencing project. Three meetings of the whole team 
were held at key stages in the project, supplemented by informal meetings between 
some members and communication via the department’s WebCT Research Forum. 
This had the benefits of enabling progress reports to be provided and the sharing of 
information gained from the consultation of other guides to citing and referencing, for 
example Shields and Walton (1995). A decision was made to follow the British 
Standards BS ISO 690-2 (British Standards Institution, 1999) and BS5605 (British 
Standards Institution, 1990) as closely as possible. These standards were already used 
in a short WebCT tutorial previously created by Mary Harrison from the University
Library, and Blackwell's, the University’s bookshop, sold a popular guide following the
British Standards produced by Fisher and Hanstock (2003).

The results from the exercise identifying errors informed the design of the tutorial 
to meet student needs. The first chapter was designed to add the elements of references 
to books gradually, with explanations as to why each is significant. Draft chapters 
were made available to the team for comment before being released to the students. 
The WebCT Research Forum enabled discussion of specific issues arising as the
tutorial was being written, for example, variance between the existing departmental 
guidance on referencing book chapters and the British Standards. It also enabled other 
staff in the department to follow the progress of the project and contribute if they 
wished. 


Implementation 

The Citing Proficiency Test tutorial was introduced to the first year undergraduate 
students as an integral part of their work on the Learning, Communications and 
Technology unit. Following discussion with the two tutors delivering the unit, it was 
agreed to assign 10/150 marks for their assessed portfolio of work to their performance 
in the five quizzes contained in the tutorial. The four chapters were released weekly 
and the students were able to complete the five quizzes as many times as they wished 
with their highest scores being recorded. 
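
The marking arrangement described above can be illustrated with a short sketch. The way quiz scores were converted into the 10 marks is not stated, so the proportional rule below is an assumption, and the function names and sample attempts are hypothetical.

```python
def best_scores(attempts):
    """Keep the highest score per quiz from (quiz_id, score) attempts.

    Scores are fractions between 0.0 and 1.0.
    """
    best = {}
    for quiz_id, score in attempts:
        best[quiz_id] = max(score, best.get(quiz_id, 0.0))
    return best


def portfolio_marks(best, n_quizzes=5, marks_available=10):
    """Assumed rule: the marks are shared equally between the quizzes and
    scale with the best score achieved on each one."""
    per_quiz = marks_available / n_quizzes
    return sum(per_quiz * best.get(q, 0.0) for q in range(1, n_quizzes + 1))


# Made-up attempts: quiz 1 is repeated until full marks, quiz 3 is not.
attempts = [(1, 0.5), (1, 1.0), (2, 1.0), (3, 0.5), (4, 1.0), (5, 1.0), (3, 0.25)]
best = best_scores(attempts)
marks = portfolio_marks(best)
print(best, marks)
```

Because only the highest score per quiz is kept, repeating a quiz can never lower a student's mark.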

The idea for the name of the tutorial developed from the decision to assess the 
quizzes. The Cycling Proficiency Test is well known throughout the UK for road 
safety, and calling the tutorial the Citing Proficiency Test provided opportunities for 
the lighter touch desired to make completing the tutorial enjoyable. 








A different approach was taken for the introduction of the tutorial to the 
postgraduates as less time was available in their timetable. The whole tutorial was 
released at once and they were given a short introduction to it during a time-tabled 
class for their Information Retrieval unit. The expectation was that the students would 
be sufficiently motivated to complete the tutorial prior to their submission of further 
assignments at the end of the spring and start of the summer terms. A fifth chapter on 
Endnote, the bibliographic management software available in the University, was also 
released to the postgraduates as they would shortly be starting work on their 
dissertations. 


Student reactions 

The students’ reactions to the tutorial were positive during the timetabled classes, as 
reported by all the tutors concerned. As the undergraduates’ experiences of online 
learning were being surveyed as part of a wider research project being conducted by 
the Learning and Teaching Support Network (SOLE, 2004), it was decided not to carry 
out a survey to collect their views. Instead, comments were invited as part of their 
feedback on the Learning, Communications and Technology unit. The postgraduates
were also being extensively surveyed at the end of their taught course, so their views 
were also collected as part of other feedback. A small number of comments, all positive, 
were received from both groups, for example: 


The quizzes are really helpful and test my knowledge there and then to see whether I have 
understood the content. 


I personally feel much happier now with citing references. 


Very useful, easy to understand and can serve as both a learning and a reference tool. I'm
likely to remember what I read here rather than information in a booklet. 


Comparison of errors made before and after completion of the tutorial 

In order to provide some indication of the impact of the tutorial, the exercise to identify 
the frequency and type of errors made was repeated. This was an attempt to follow the 
guidance given by Collis and Moonen (2001, p. 129): 


What we are most interested in regarding learning as a consequence of using technology
often can’t be measured in the short term or without different approaches to measurement.
Measure what can be measured, such as short-term gains in efficiency or increases in
flexibility.


The results indicate some improvements in the references given by postgraduates
(Table III), although this can only be a cautious claim as again it was only possible to
look at random samples of the work by 20 students for each of three assignments.

For the undergraduates, the work of all students was examined and 
disappointingly, the initial comparison showed little indication of improvement, with 
an increase in the percentages of references containing one or more errors. There was, 
however, an increase in the total number of references given by undergraduates. This 
could indicate a greater awareness of the importance of referring to sources, although 
the range of formats remained limited. 

However, comparison of the detailed information gathered about the errors showed
that there had been improvements in presenting references by both the undergraduates
and postgraduates. The results were distorted by errors resulting from the changed
guidelines requiring closer conformance to the British Standards. For example, the
requirement to put the author’s name in upper case letters was ignored in 67 per cent of
the references by undergraduates, which accounts for most of the increase in the
number of references to books containing one or more errors as shown in Table IV. As
a tally had been kept only of references with one or more errors, it was not possible to
subtract those with only this error at a later date.

Clearer guidance was given in the tutorial on e-documents than in the department’s
booklet, which is likely to have led to the marked improvements shown in Table IV.
However, a significant number of references by the undergraduate students just gave
the Uniform Resource Locator (URL) rather than a full reference both before and after
completing the tutorial (see Table V). This may be influenced by the fact that it is
increasingly common just to give the URL, for example, in newspaper articles, and it
may appear unnecessary to students to give more detail.


Table III. References containing one or more errors before and after completing the tutorial

                         Undergraduates                          Postgraduates
                         Before              After               Before              After
Format                   Number   Per cent   Number   Per cent   Number   Per cent   Number   Per cent
Books                    114/184  61.9       277/313  88.5       168/379  44.3       119/367  32.4
Book chapters            -        -          4/5      80         21/28    75         14/29    48.3
Reports                  -        -          -        -          10/11    90.9       7/11     63.6
Print journal articles   -        -          8/13     61.5       100/150  66.6       45/171   26.3
E-journal articles       -        -          -        -          26/26    100        18/67    26.8
E-documents              142/167  85         193/213  90.6       98/122   80.3       125/269  46.5


Table IV. Most frequent types of errors made in references to books

Frequency of occurrence
                                          Undergraduates            Postgraduates
Type of error                             Before (%)   After (%)    Before (%)   After (%)
Missing contents
  Place of publication                    70.1         46.9         41.6         31.9
  Publisher details                       38.6         28.8         -            1.7
  Date of publication                     14           3.9          4.8          0.8
Presentation
  Author not in upper case                n/a          67           n/a          50.4
  Author’s first name in full             25.4         28.8         36.3         8.4
  Author’s name/initials precede surname  21.9         10.8         -            -
  Title precedes author                   10.5         0.4          -            -
  Title not in italics                    55.2         25.6         30.4         20.2

Note: Each type of error is expressed as a percentage of the total number of references to books
containing one or more errors (see Table III)


Tracking student use of the tutorial

WebCT provides detailed facilities for tracking student use of online materials. The
postgraduates completed the tutorial in March and some indication of the value placed
on it is shown by the continued use of it by 20 students in subsequent months. For the
undergraduate students, checks were made on a weekly basis to see if the students had
logged on to the WebCT area for the Learning, Communications and Technology unit
and they were told that this would be used as an electronic register. The original plan
had been to use WebCT just for this tutorial, but the interest of the tutors led to use
being extended to support the students’ work on group presentations. The majority of
students logged on regularly, but as they were also using the course area for assessed
group work, they may have been logging on for other reasons than completing the
tutorial. Around 20 students did not participate, but these students had a poor
attendance record for this subject, despite warning letters having been sent to them.

Given that fewer improvements were made by the undergraduate students, a more 
detailed investigation was carried out into their use of the tutorial. The tracking 
facilities show which pages were accessed and on which date. A summary of the extent 
to which the tutorial was accessed is given in Table VI. As it would appear that 29 (25 
per cent) of the students had not received any additional instruction on citing and 
referencing practice, this may give some explanation for the persistent errors. 

More students may have in fact used the tutorial, for example, through sharing and 
copying printouts of the text, but as only 81 (70 per cent) of the students attempted the 
quizzes, it is likely that such additional usage was limited. The students were allowed 
to attempt the quizzes as many times as they wished, so could obtain full marks 
through repetition. A check of the history of quiz attempts showed that most gaining 
full marks had made more than one attempt. The incentive of this counting for 10/150 
marks for their assessment of the unit appears to have worked for the majority, but 
some were satisfied with less as shown by Table VII. The number of errors made after 
the tutorial may indicate that the students had not retained the information needed to 
transfer to their own practice, or it may indicate that there was some sharing of the 
correct answers between students who had not fully understood the requirements. 


Table V. Most frequent types of errors made in references to e-documents

Frequency of occurrence
                             Undergraduates            Postgraduates
Type of error                Before (%)   After (%)    Before (%)   After (%)
Missing contents
  Author                     7            6.7          41.8         10.4
  Type of medium (online)    34.5         3.6          28.6         7.2
  Date of consultation       42.9         9.8          45.9         3.2
  Just URL given             49.3         54.9         -            1.6

Note: Each type of error is expressed as a percentage of the total number of references to e-documents
containing one or more errors (see Table III)


Table VI. Undergraduate student use of the tutorial

Proportion of the tutorial accessed    Number of students    Percentage
All of the tutorial                    49                    42.2
Three chapters                         23                    19.8
Two chapters                           14                    12.1
One chapter                            1                     0.9
None                                   29                    25
Total                                  116                   100


Table VII. Undergraduate marks on quizzes

Quiz scores                                    Number of students    Percentage
Full marks on all 5                            49                    60.5
Full marks on 4/5                              9                     11.1
Full marks on 3/5                              6                     7.4
Full marks on 2/5                              6                     7.4
Full marks on 1/5                              4                     4.9
Full marks on 0/5                              7                     8.6
Total number of students attempting quizzes    81                    100


Usability testing

A researcher with experience gained through employment at the National Library for
the Blind carried out testing of the tutorial with the screen reader JAWS (version 3.7).
Despite the use of frames by WebCT, she found it possible to navigate the tutorial, but
felt that a visually impaired student would need some additional training and support
when new to using WebCT.

An attempt at undertaking a user testing study with disabled students had been
made as part of the other project in the autumn term 2002, but e-mail requests for
volunteers sent out by the University’s Learning Support Unit had twice been
unsuccessful. In the summer term, funding was provided by the Faculty for nine
students with specific learning difficulties to undertake ten paid hours of user testing.
This group included students with dyslexia and was chosen because this is the most
common disability at the university. This time the request via the Learning Support
Unit resulted in a good response and students from other departments across the
university took part. They were asked to complete feedback and log sheets as they
worked through both tutorials and to attend a plenary focus group discussion.

None found it difficult to navigate or understand the Citing Proficiency Test
tutorial. Three used software to help them read on-screen information, such as
TextHelp Read and Write. Some had not known how to alter the appearance of the text
and background using Internet Explorer’s accessibility features. Two preferred the
Comic Sans font to Arial and one said “the problem you are going to have with
anything like that is each person is different. I like black on pastel green, someone else
will like red on blue; you're not going to cater for everyone.” All agreed that giving
people choices was what mattered.


Reflection

While the usability testing was only on a small scale, it helped to inform future practice
by highlighting the need for greater awareness-raising of ways of adjusting the
settings in Internet Explorer. Until further improvements are made in future editions of
the WebCT software, tutors need to know how to provide options for adjustments to
meet specific needs. For example, the University of Aberdeen (2003) recommends that:


Adding the Compile Tool to your site, either on the Home page, or on a subsidiary page, may
help screen reader users. This allows a series of pages to be viewed as a single document for
printing, or copied and pasted into Word.


The results of the before and after comparison of student referencing errors in Cycle 2
showed that some improvements had been made. The postgraduate students achieved
a higher rate of improvement, maybe because they were sufficiently well motivated to
complete the tutorial independently. Some improvements may have resulted from
students being given guidance by tutors when their work from the autumn term was 
returned, rather than from the tutorial. However, since the usual practice was not to 
make detailed corrections, the extent of such influence is likely only to have been small. 

A change in departmental practice part way through the year may also have caused 
some confusion, particularly since tutors’ reading lists were not changed from the 
original practice. The high number of students ignoring the guidance to put authors’ 
surnames in upper case was attributed to this change. Repeating the exercise to 
identify the extent and nature of the errors made in 2003/2004 would indicate whether 
this was the case. 

The non-use of the tutorial by one-quarter of the undergraduates needs to be seen in 
the context of the wider problems of poor attendance and low motivation of this cohort 
of first year students of the Learning, Communications and Technology Unit. Some of 
the students in greatest need of the core skills taught through the unit opted not to 
participate. This was identified as an issue to be addressed in the academic year 
2003/2004. 


Cycle 3: 2003/2004 

Replanning 

The department’s staff agreed to adopt the changes in practice introduced in the 
tutorial for all students from September 2003, including second and third year 
undergraduates. An agreed benefit of making the tutorial available to the students
throughout their studies was that it would enable tutors to recommend returning to it 
for remedial action and support. To support this decision, the following actions were 
carried out at the start of the academic year: 


* The checklist used to identify errors was developed into a guidance sheet for 
tutors to use in giving feedback. 


* Coloured, laminated cards were produced giving sample references for different
formats of material drawn from the tutorial, one for the Harvard System and one for
the Numeric System. These were to be given to new students after completing the
tutorial as a memory aid and distributed to returning students.


* On their return into the second year, the undergraduates were reminded in
classes about the importance of citing and referencing correctly, recommended to 
revisit the tutorial and made aware that the tracking facilities had shown some 
non-use. The third year students were introduced to the tutorial in a briefing 
about their major projects and dissertations. 


* All new students were shown how to use the accessibility features on Internet
Explorer.


The tutors responsible for the Learning, Communications and Technology Unit were
interested by the students’ positive reactions to a WebCT tutorial in class and the 
potential for the quiz and tracking facilities. This led them to reconsider the way in 
which the unit was taught, with the aim of improving student attendance and 
motivation. For the 2003/2004 academic year, they decided to replace large lectures 
with WebCT tutorials. This would enable greater levels of support through regular 
small group meetings with personal tutors, in addition to weekly face-to-face seminars 
or lab sessions. The programme included online tutorials on time management and
academic writing preceding the introduction of the Citing Proficiency Test tutorial in
the sixth week of autumn term. The allocation of 10/150 marks for completing the five
quizzes as part of the assessed portfolio of work was considered to be an important
incentive and included in the autumn term rather than the spring term.

The postgraduate programme was extensively revised in the summer term 2003.
This included plans for further use of WebCT tutorials throughout the academic year
for a new Information Environments Unit. The Citing Proficiency Test tutorial was
incorporated into the programme for this unit and one week allowed for its completion
in the fifth week of the autumn term.


Implementation and student reactions

In 2003/2004, the numbers of students new to the department were:


* a total of 96 undergraduates following the common first year programme; and
* a total of 49 postgraduates following the taught Master’s courses.


The students’ reactions to the tutorial were again positive during the timetabled
classes, but for both groups, the “novelty” factor was reduced as they were using
WebCT more extensively. At the end of the year, a survey of the undergraduate
students about the use of WebCT included a question about this tutorial as well as
others used in the Learning, Communications and Technology unit. Although only 26
(27.1 per cent) of the students responded, the results were generally positive as shown
in Table VIII.


Table VIII. Undergraduate student feedback

How useful was the Citing Proficiency Test tutorial in supporting your studies?
                      Number    Per cent
Very useful           11        42
Useful                10        38
No strong feelings    3         11
Limited usefulness    1         4
Not useful            -         -
Did not use           1         4


Another question in the survey asked the students to tick statements applying to
them, including the following:

* I liked the flexibility of being able to use the tutorials whenever it suited me: 23
(88 per cent).
* I found learning this way helped me to concentrate on the topic: 13 (50 per cent).
* I found learning this way helped me to remember what I'd learned: 15 (58 per
cent).
* I found the self-tests and quizzes useful in helping me check my understanding:
21 (81 per cent).


Comparison of errors made in 2002/2003 and 2003/2004

At the end of the academic year, the exercise to identify the extent and nature of
student errors was repeated. Again, all the undergraduate students’ reference lists for
three pieces of work and random samples of the work by 20 postgraduate students for
each of three assignments were examined. For each group, assignments from the
beginning, middle and end of the year were selected. Comparison with the results from
Cycle 2 indicated improvements all round as shown in Table IX, although the number
of errors made by the undergraduate students remained relatively high in comparison
with the postgraduates.

The difference in performance between the years indicates that the change in
practice part way through the year in 2002/2003 was likely to have been a factor. As 
shown in Table X, the requirement to put authors’ names in upper case was followed 
more often in 2003/2004, although the percentage of non-compliance by 
undergraduates remained high at 49.7 per cent. The British Standard is unusual in 
making this requirement and it is less likely to be familiar to students from their 
previous experiences. For both groups of students, there was also an increase in the 
percentage of references without titles in italics. In both these cases, there may have 
been some conflicting advice in relation to accessible design of web pages, for which 
the use of upper case and italics is discouraged. An introduction to web page design is 
part of a mandatory unit for all Stage 1 and postgraduate students in the department. 





Table IX. Comparison of references containing one or more errors after completing the tutorial

                         Undergraduates                          Postgraduates
                         2002/2003           2003/2004           2002/2003           2003/2004
Format                   Number   Per cent   Number   Per cent   Number   Per cent   Number   Per cent
Books                    277/313  88         185/386  48         119/367  32         33/230   14
Book chapters            4/5      80         1/3      33         14/29    48         1/28     3.6
Reports                  -        -          -        -          7/11     63         -        -
Print journal articles   8/13     61         9/20     45         45/171   26         11/130   8.5
E-journal articles       -        -          2/8      25         18/67    26         8/101    7.9
E-documents              193/213  91         89/305   29         125/269  46         12/219   5.5


Table X. Comparison of most frequent types of errors made in references to books after completing the tutorial

Frequency of occurrence
                                          Undergraduates                  Postgraduates
Type of error                             2002/2003 (%)   2003/2004 (%)   2002/2003 (%)   2003/2004 (%)
Missing contents
  Place of publication                    46.9            15.1            31.9            24.2
  Publisher details                       28.8            9.7             1.7             18.2
  Date of publication                     3.9             1.6             0.8             -
Presentation
  Author not in upper case                67              49.7            50.4            24.2
  Author’s first name in full             28.8            17.8            8.4             27.3
  Author’s name/initials precede surname  10.8            6.5             -               -
  Title precedes author                   0.4             3.8             -               -
  Title not in italics                    25.6            58.9            20.2            60.6

Notes: Each type of error is expressed as a percentage of the total number of references to books
containing one or more errors (see Table IX); NB: only 33/230 postgraduate references to books
contained one or more errors in 2003/2004


The total number of errors in references to electronic documents was also reduced, but
examination of the nature of the errors showed continued high percentages of
references from undergraduates giving just the URL, as shown in Table XI. This
indicates a need for further emphasis and explanation in the online tutorial and by staff
when giving feedback.


Table XI. Comparison of most frequent types of errors made in references to electronic documents after completing the tutorial

Frequency of occurrence
                             Undergraduates                  Postgraduates
Type of error                2002/2003 (%)   2003/2004 (%)   2002/2003 (%)   2003/2004 (%)
Missing contents
  Author                     6.7             11.2            10.4            33.3
  Type of medium (online)    3.6             19              7.2             -
  Date of consultation       9.8             18              3.2             16.6
  Just URL given             54.9            47.2            1.6             25

Notes: Each type of error is expressed as a percentage of the total number of references to
e-documents with one or more errors (see Table IX); NB: only 12/219 postgraduate references to
e-documents contained one or more errors in 2003/2004


Tracking undergraduate student use of the tutorial

As in the previous year, a more detailed investigation into the use of the tutorial by the
undergraduate students was considered necessary, given that the reduction in the
number of errors was less in comparison with the postgraduates. The results in
Table XII show that a smaller proportion of students accessed the whole of the tutorial
in 2003/2004 compared with 2002/2003 and that the numbers completing only one or
two chapters had increased. This may have resulted from all four chapters being
released at once in 2003/2004 rather than week by week as in the previous year.

As in the previous year, most gaining full marks in the quizzes had made several
attempts. However, there was a reduction in the number of students gaining full marks
on all five quizzes, as shown in Table XIII, although the proportion of students
attempting the quizzes had increased.


Table XII. Comparison of undergraduate student use of the tutorial

                                       Students 2002/2003     Students 2003/2004
Proportion of the tutorial accessed    Number    Per cent     Number    Per cent
All of the tutorial                    49        42.2         33        34.4
Three chapters                         23        19.8         9         9.4
Two chapters                           14        12.1         10        10.4
One chapter                            1         0.9          19        19.8
None                                   29        25           25        26
Total                                  116       100          96        100


Discussion and conclusions

Overall, the initiative has had an effect in reducing the number of errors in references
made by both the undergraduate and postgraduate students. However, as there are
other variables besides the introduction of the WebCT tutorial which may have led to
this improvement, some caution needs to be used in interpreting the results.

The checklists to record errors in the students’ references were reliable research 
instruments as it was possible to repeat the exercise in the second year to provide a 
valid indication of the impact of the interventions. Although the initial “before and 
after” comparison of the results for undergraduate students indicated that the 
proportion of errors had increased rather than decreased (Table III), the more detailed
results showed that there had in fact been improvements (Table IV), but indicated that
the change in practice to closer compliance with the British Standards in requiring 
authors’ names to be in upper case may have caused some confusion. In 2003/2004, the 
drop in the number of errors showed that this was likely to have been a factor 
(Table IX). If the checklist were to be used again, an improvement would be to keep 
separate tallies of references with one, two, three and more errors, rather than a simple 
tally with one or more errors. This would make it possible to identify whether one 
particular error was affecting the results, but would have the disadvantage of adding 
to an already time consuming process. 
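
The suggested improvement to the checklist could be sketched as follows. This is an illustration under stated assumptions rather than the project's method: each reference is assumed to be recorded as the set of errors found in it, and the function names, error labels and sample data are hypothetical.

```python
from collections import Counter


def tally_by_error_count(references, cap=3):
    """Count references by how many errors each contains.

    References with `cap` or more errors are pooled into one bucket;
    references with no errors are not counted.
    """
    buckets = Counter()
    for errors in references:
        n = len(errors)
        if n:
            buckets[min(n, cap)] += 1
    return buckets


def only_this_error(references, error):
    """Number of references whose single error is the given one."""
    return sum(1 for errors in references if errors == {error})


# Made-up references, each recorded as the set of errors found in it.
sample = [
    {"author not in upper case"},
    {"author not in upper case", "title not in italics"},
    {"missing place of publication", "missing publisher details",
     "title not in italics", "author not in upper case"},
    set(),
    {"author not in upper case"},
]
buckets = tally_by_error_count(sample)
upper_only = only_this_error(sample, "author not in upper case")
print(dict(buckets), upper_only)
```

With tallies of this kind, references containing only the upper case error could be subtracted afterwards to see how far that single requirement affects the overall figures.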

As it was only possible to examine samples of the students’ work given time 
constraints, the results are indicative rather than conclusive. Individual differences in 
ability also need to be taken into account in comparing groups of students with each 
other, rather than tracking individuals over time. The latter was considered by the 
project team initially, but rejected because of the lack of anonymity for students. 
Reassurances were given to the students that analysis of their reference lists would 
take place after the work had been marked by their tutors, would not affect their results 
in any way and that no names would be recorded. 

Table XIII. Comparison of undergraduate marks on quizzes

                                               Students 2002/2003     Students 2003/2004
Quiz scores                                    Number    Per cent     Number    Per cent
Full marks on all 5                            49        60.5         22        29.3
Full marks on 4/5                              9         11.1         18        24
Full marks on 3/5                              6         7.4          7         9.3
Full marks on 2/5                              6         7.4          8         10.7
Full marks on 1/5                              4         4.9          13        17.3
Full marks on 0/5                              7         8.6          7         7.2
Total number of students attempting quizzes    81/116    69.8         75/96     78.1


Detailed investigation through WebCT’s tracking tools shows that caution is
needed in attributing the decreasing number of errors made by undergraduates to the
WebCT tutorial. While use of the tutorial suited some students, as shown in the
feedback given and in their performance in the quizzes, others only partially completed
the tutorial and the number of non-participants remained high in both years (Table XII).
In the second year, the proportion of students attempting the quizzes increased, but the
numbers taking the opportunity to gain full marks decreased (Table XIII). The
prevalent attitude among many students of doing simply “enough to pass” may apply
regardless of the method of delivery. Possible solutions for 2004/2005 may be to require
all quizzes to be attempted before any marks are given, to increase the marks available
to encourage completion, or to only count quizzes for which full marks had been
achieved. However, the latter option might have the disadvantage of encouraging
cheating. 
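
The effect of the three possible marking rules on a partially completed set of quizzes can be compared in a short sketch. The proportional baseline is an assumption, since the conversion of quiz scores into marks is not stated, and all function names and the sample student are hypothetical.

```python
QUIZZES = range(1, 6)  # the five quizzes in the tutorial


def marks_proportional(best, marks_available=10):
    """Assumed baseline: marks shared equally between quizzes and scaled
    by the best score (0.0 to 1.0) on each; unattempted quizzes score 0."""
    per_quiz = marks_available / len(QUIZZES)
    return sum(per_quiz * best.get(q, 0.0) for q in QUIZZES)


def marks_all_attempted(best, marks_available=10):
    """No marks at all unless every quiz has been attempted."""
    if any(q not in best for q in QUIZZES):
        return 0.0
    return marks_proportional(best, marks_available)


def marks_full_only(best, marks_available=10):
    """Only quizzes completed with full marks count."""
    per_quiz = marks_available / len(QUIZZES)
    return sum(per_quiz for q in QUIZZES if best.get(q, 0.0) == 1.0)


# Made-up student: full marks on two quizzes, half marks on one, two unattempted.
student = {1: 1.0, 2: 1.0, 3: 0.5}
baseline = marks_proportional(student)
strict = marks_all_attempted(student)
full_only = marks_full_only(student)
doubled = marks_proportional(student, marks_available=20)
print(baseline, strict, full_only, doubled)
```

For this hypothetical student, the first rule removes all marks, the full-marks rule removes only the partial credit, and raising the marks available changes the weight of the same performance rather than the behaviour required.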

The other interventions in 2003/2004 are also likely to have had an impact. The 
laminated cards have been popular with both staff and students, and the guidance 
sheet for tutors may have increased consistency in giving feedback that accurate 
references are important. 

The ability to pay attention to detail and follow instructions accurately is 
particularly important for all students in the department, whether they wish to become 
librarians or web designers. In this respect, the emphasis given to referencing practice 
at an early stage in their studies is of wider benefit. However, concerns have been 
raised by both staff and students about whether it is advisable to follow the guidance 
in the British Standards about putting authors’ names in upper case and using italics 
as these can cause accessibility problems when publishing on the web. When 
necessary, adjustments can be made easily to the online tutorial to accommodate the 
needs of disabled students, but this contradicts advice in teaching web design that it is 
preferable to adopt a universal design approach. It may be timely for the British 
Standards to be reviewed to take the changing needs of electronic publishing into 
account. 

Although the initiative began as a small project, it has led to wider developments, as 
is often the case with action research (Cohen et al., 2000). In the university, interest in 
the tutorial led to it being recommended in the Faculty Academic Standards Committee 
Code of Practice for use by all departments and to collaboration with colleagues from 
the Department of Sociology in the design of a companion interactive WebCT tutorial 
to help students avoid plagiarism, to be introduced in 2004/2005. As the creation of the 
Citing Proficiency Test tutorial was financed by the LTSN-ICS, guest access to view 
the tutorial and obtain copies is available by contacting the author. External interest 
has led to the tutorial being adopted for use by the Information Management School at 
London Metropolitan University and the Department of Continuing Education at the 
University of Manchester. 

The project has also contributed to the department’s strategy to extend the use of 
WebCT through the creation of learning materials for which interactive, online 
delivery is most appropriate. Prior to the initiatives in 2002/2003, WebCT had only 
been used for three optional units involving some postgraduates and some 
undergraduates in their second and third years. The project helped to mainstream 
the use of WebCT through its use with all students new to the department in 
2002/2003, involved a greater number of staff in delivery via WebCT and acted as a 
catalyst for further developments in the use of WebCT in 2003/2004. In 2004/2005, 
further developments are planned and finding some measures, however small, through 
which the effectiveness of initiatives can be judged, will continue to be important in 
future cycles of activity. Lessons learnt from tracking student participation and 
performance through the tools available in WebCT in this project will be applicable to 
other initiatives. Although indicators of success need to take into account other 
variables, they provide some accountability for the investment in creating online 
learning resources. While student use of the online tutorial on citing and referencing 
can only be seen as making a partial contribution to improvements in their 
performance, it has had an impact on some of the students’ learning experiences which 
may benefit them in the longer term. 








Note 


1. WebCT is commercial software which allows lecturers, without the need for programming 
skills, to manage a sequence of web pages as an online tutorial incorporating features such 
as quizzes and bulletin boards. 

How to manage the big bang: 
evolution or revolution in the 
introduction of an MLE? 

Abstract 

Purpose — To examine how introducing an institution-wide managed learning environment impacts 
on the processes of organisational change using City University, London as a case study. 
Design/methodology/approach — Literature-based discussion of current issues around the 
introduction of online learning to provide theoretical framework. Action research methodology used 
for interviews with leading members of the institution. 

Findings — There is a significant amount of literature available on institutional change and managed 
learning environments; however, how the introduction of such systems operates in practice depends on 
the context of the institution. In the interviews with key stakeholders six significant themes are 
identified for the management of change in this area: pedagogic direction; operational connections and 
development; organisational structure and change; system process; professional development; 
strategic vision and perception. Any implementation project regarding the introduction of managed 
learning environments should encompass these key themes. 

Research limitations/implications — Based on interviews with a small number of stakeholders at 
the institution. Further research could compare the experience at City with other institutions and 
revisit a wider selection of stakeholders at City to assess their views at a later stage in the 
implementation. 

Practical implications — Provides guidance after the experiences encountered at the institution 
which could assist other universities both during the planning phases of such a project or during the 
implementation itself. 

Originality/value — Identifies a number of key areas to shape and formulate project management. 
Combines empirical evidence with theoretical context. 


Keywords Learning organizations, Computer based learning, Organizational change 
Paper type Case study 


Introduction 

“Higher education cannot change easily”, writes Diana Laurillard, yet, she concedes 
that it is being “forced to change” (Laurillard, 2002, p. 3) and the introduction of 
wide-scale e-learning to UK higher education (HE) campuses is one of the key drivers of 
this pressure on institutions to reinvent themselves. Indeed one of the main factors in 
this process centres around the move from small-scale e-learning initiatives, often 


The authors would like to thank those members of staff at City University who agreed to be 
interviewed for this research, Melanie Sanderson for her assistance with the data collection and 
other members of the ELU team for providing inspiration and feedback. An earlier version of this 
article appeared in the Proceedings of the Networked Learning Conference in 2004 — see 
www.shef.ac.uk/nlc2004/Proceedings/Individual_Papers/Quinsee_Summer.htm (accessed 5 January 
2005). 








using a virtual learning environment (VLE) to centrally managed, joined up systems 
and processes that are institution-wide through the creation of a managed learning 
environment (MLE)[1]. MLEs are regarded as favourable due to economies of scale and 
efficiency and because they can “streamline” the student experience (see Lee, 2003). 
However, as Britain (2001) cautions, the introduction of “new technology into an 
organisation will necessarily involve a process of change”; change can even be seen as 
“the reason for adopting the technology”. 

Yet, while there seems to be widespread agreement that technology and change are 
inextricably linked, particularly in relation to the HE environment, there is less 
consensus on how such change can be implemented or embraced. A number of models 
have been put forward to help shape the philosophy and direction of change, such as 
Laurillard’s (2002, p. 215) notion that institutions need to become “learning 
organisations”, but while HE institutions are regarded as resistant to change there 
is an obvious tension here. In addition, many UK HE institutions are fiercely 
protective of their individuality. As Stiles (2003, p. 2) notes, a “need for the 
organisation to become ‘distinctive’ in a changing and competitive [. . .] sector” can be a 
major driver for reviewing teaching and learning practice and strategy. Boys (2002, 
p. 10) reminds us that: 


... the requirements of scaling-up and integration demanded by an MLE necessarily throw 
into relief the inherent tensions in large complex organisations with different stakeholder 
perspectives. 


So where does this leave a UK HE institution in the process of implementing an MLE? 
Faced with often conflicting internal and external drivers and levers, the introduction 
of an MLE can seem like a panacea, a placebo or an inevitable consequence of the 
changing HE marketplace. The aims of this paper are to first contribute to this growing 
body of research evidence on issues surrounding the management of institutional 
change, with specific reference to the implementation of an MLE at City University, 
and second to promote the practice of action research as a way of facilitating the 
management of such change (see for example, Searle, 2003; JISC, 2002b; Foster et al., 
1999; Steeples and Jones, 2002; Collis and van der Wende, 2002). 
As Jane Searle (2003, p. 11) argues: 


... in most accounts of institutional change there is a recognition that successful institutional 
implementation of learning technologies depends on key individual stakeholders. 


Semi-structured interviews carried out in recent months with several key decision 
makers at City University have formed the basis of this research. Although the 
investigations are ongoing, the paper provides a snapshot of the institution at a critical 
moment in the implementation of an MLE. It has been suggested that “there are two 
basic paradigms for MLE development, one concerned with merely integrating 
existing systems and the other with rethinking educational and organizational 
processes” (Boys, 2002, p. 10). We might characterize these as evolutionary and 
revolutionary respectively. Before considering how this applies to City, we consider the 
theoretical perspective more fully. 


Theory and method 
Much of the growing literature on change in HE in general, including the impact of 
technology, has been characterised by attempts to map experience and empirical data 
to models of change. Given the essentially integrative nature of MLEs it is clear that 
the full-scale implementation of an MLE potentially involves all aspects of educational 
and organizational processes. Inevitably this embraces the multiple organizational 
cultures that constitute a modern university. As Adrianna Kezar (2001, p. 2) has 
observed: 


... the need for cultural models seems clear from the embeddedness of members who create 
and reproduce the history and values, the stable nature of employment, the strong 
organizational identification of members, the emphasis on values. 


Additionally, as the evidence analysed below suggests, social cognition models are 
relevant to any analysis of organizational change since, at a basic level, there are 
multiple interpretations of what an MLE is, as well as competing visions of what it can 
be used for. 

As far as the speed of change is concerned a recent Universities and Colleges 
Information Systems Association (UCISA) report by Browne and Jenkins (2003) notes 
that among UK Higher Education and Colleges “the overall picture is one of 
evolutionary consolidation”. The previous study carried out by Collis and van der 
Wende (2002) on the use of ICT in HE in general and the uptake of VLEs/MLEs in 
particular, concludes that while change is indeed slow, “nevertheless institutions are 
gradually ‘stretching the mould’” although “changes [...] are gradual and usually 
slow” (Collis and van der Wende, 2002, p. 7). The present study has been framed with 
reference to both of these aspects: the models and the speed of change. 

Fullan (1991) has drawn attention to the importance of examining the subjective 
meaning of change for those involved in the process, pointing out that subjective 
meanings may be different not only for individuals but for groups of individuals, be 
they academics, managers or from support services. Since all three groups are (or 
should be) involved in the rollout of an MLE, 15 key decision makers, five individuals 
from each of these areas at City, were invited to participate in recorded interviews about 
the implementation of the MLE. The E-Learning Unit (ELU), charged with leading the 
e-learning initiative at City, decided to carry out this research in order to help shape its 
future planning agenda. While there were some strategic objectives established in the 
institution around e-learning, it was felt that these may not be apparent to all within 
the organisation or the effects of a wide-scale implementation may be viewed 
differently. This decision to undertake such action research was triggered in part by a 
wish to see how senior decision makers in the university perceived the impact of 
e-learning on core business activities. 

Bentz and Shapiro (1998, p. 127) define action research as: 


... less a separate culture of enquiry than [...] a statement of intention and values. The 
intention is to change a system, and the values are those of participation, self-determination, 
empowerment through knowledge, and change. 


By interviewing various stakeholders from the academic, management, administrative 
and support units within City this research intends to encourage this notion of 
reflective practitionership. The interview process itself involves a debate about the 
issues which result from the acquisition of an MLE and provides a forum for the 
dissemination of information as well as an opportunity for contributing to the decision 
making processes. Of the key stakeholders initially identified, seven have been 
interviewed thus far. While there is a reasonable spread of academics and senior 
managers from both academic and support services there has been no positive 
response from those with a specifically technical responsibility. Whether or not this 
reflects a feeling of exclusion from the wider implications of MLE implementation has 
yet to be established. 

Staff from the ELU designed and piloted an interview schedule which focused on 
the following key areas: 


* implications for infrastructure development and pedagogic direction; 
* the drivers behind MLE procurement; 
* responsibility for producing an e-strategy and perceptions of the role of the MLE 
in this; 
* staff support and development; 
* student support; and 
* strategic vision — evolution or revolution? 


Semi-structured interviews were chosen as the appropriate method; such an 
instrument allows the exploration of shared or contested meanings of some of the 
key terms and issues involved in the interview process, such as e-learning, MLE/VLE, 
the future direction of the university and its e-vision. The interviews were recorded and 
full transcripts made available to each of the interviewees with a guarantee of 
confidentiality. These interviews were carried out six months into the implementation 
process over the Christmas period 2003-2004. A total of six interviews were carried out; 
three with administrative/support staff and three with academic staff. 


Context 

City University has been delivering online learning for over four years in a distributed 
model — certain departments and individuals have been pioneering new technologies, 
through VLEs and other web-based solutions, while other areas have been largely 
untouched by new e-learning initiatives. The rationale for developing such e-learning 
offerings has been mixed, ranging from developing new delivery modes to increasing and 
widening participation rates to experimenting with more innovative methods of 
classroom delivery. Such an evolutionary developmental model for e-learning 
implementation is not unfamiliar to other institutions, as recorded by Browne and 
Jenkins (2003). A UCISA (2001) report described VLEs as “part of a continuum of 
development” and their deployment at institutional level as symptomatic of an 
institution reaching an “innovative” stage of development (Browne and Jenkins, 2003, 
p. 24). Such a process is mirrored in the experience at City — this is common in pre-1992 
universities which are often characterised by a devolved decision-making model. 

In 2003, the situation at City radically changed with a high-level strategic 
commitment to roll out e-learning across the institution. This was evidenced by the 
establishment of the ELU and the purchase of a site-wide license to an MLE. There 
were a number of reasons why this change in policy occurred and the development 
came at an apposite time. First, the pressure on localised initiatives had grown to a 
level where the initiatives could no longer be sustained efficiently. Second, interest 
across the institution was growing and there was a concern that inefficiencies were 
occurring through repetition and inadequate resource sharing. Third, a number of 
other university initiatives to improve City’s “e-readiness” (both from a learning and 
teaching and administrative perspective) were now in progress[2]. 

The E-Learning Unit (ELU) was placed at the centre of the e-learning process by 
both managing the rollout of the MLE over the summer of 2003 and supporting all staff 
engaged with online learning in the university. In order to complete this large task in a 
short time a set of priorities was drawn up including technical implementation and 
integration; staff development and training; migration of existing material; student 
support and guidance to ensure that the project achieved its targets. In September 
2003, the MLE went live on time and on target with over 70 modules operational to 
2,000 students. The experience that the institution underwent in achieving this result in 
such a short time scale is an important one and one which can provide useful evidence 
for other institutions faced with a similar situation. These interviews conducted with 
senior management enabled us to discover the perceptions on the e-learning initiative 
six months on. 


Findings 

The e-learning initiative at City transects the boundaries between academic activities, 
administrative activities and support areas. In order for the MLE to function efficiently 
it needs to pull information from all these different systems and act as a conduit 
between the various business processes of the university. Although City is still in the 
early stages of this implementation, a number of key findings have resulted from the 
changes experienced by the institution. These can be grouped into the following areas, 
which loosely map onto the major themes of the interviews listed above: 


* Pedagogic direction — the impact of e-learning on existing and new modes of 
learning. 
* Operational connections and development — relationships between registry, 
administrators and academics. 
* Organisational structure and change — where to situate e-learning, who has 
responsibility? 
* System process — technological constraints. 
* Professional development — how to educate staff and students. 
* Strategic vision and perception — what is e-learning all about? 


Pedagogic direction 

Despite the relatively early stages of the e-learning initiative at City there has been 
some re-evaluation of pedagogic models with the advent of online learning. The main 
principle behind the wide-scale adoption of the MLE has shifted from supporting 
flexible modes of learning to supporting face-to-face teaching. 

However, there are divided opinions as to the effectiveness of online learning on 
students’ knowledge acquisition. There is some feeling that learning will become more 
student-centred; one senior academic commented that it will “make people become 
more student focused because when designing [a] VLE at the centre of that design 
schedule should be the student’s experience”. Another academic argued that 
“e-learning can only ever be a subset of teaching and learning” and he was concerned 





that the primary motivation for the introduction of the MLE was not on pedagogic 
grounds. Yet, he continued that “e-learning is part of [the] infrastructural support for 
teaching and learning”. There was a definite understanding from academic staff as to 
how e-learning could contribute to the direction of learning and teaching activities 
within the university. However, a senior administrator was less sure; while conceding 
that e-learning “supports a strategy for excellence in professional teaching” he did not 
regard the MLE itself as automatically enhancing teaching and learning. 

While there are often debates about whether the technology is driving the 
pedagogy, our experience is that these drivers need to be considered together. MLEs 
can facilitate innovation if carefully used but they can also become merely expensive 
document repositories of PowerPoint slides or Word files. And if the technological 
infrastructure is not in place to support these modes of learning then the initiative will 
inevitably fail. 

Key lesson: ensure that a pedagogic focus is maintained at each stage in the project 
and that this is communicated to all staff, whether academics, administrators or 
support. 


Operational connections and development 

One of the key factors in ensuring connectedness in terms of implementation at City 
has been the location of the ELU within the organisational structure. As part of Library 
Information Services and ultimately Information Services, communication and 
operational relationships have been optimised while maintaining academic integrity — 
the head of the unit is an academic and all other staff have high academic credentials. 
This has been an important factor in ensuring academic buy-in. Yet locating the unit 
within Information Services enables close links with other key departments, such as 
Business Systems. 

The short timescale of the implementation process did cause problems for the 
development of an interface between the student record system and online learning 
environment. Concern about the rapid deployment of the MLE was expressed by some 
interviewees. Delays to other projects around the university have had significant 
impact on the effectiveness of the data transfer process within the MLE. 

From the administrators’ perspective the introduction of the MLE has great benefits 
for student administration but the benefits elsewhere are less clear. For academic staff 
the connections between the pedagogic and administrative are clearer; one senior 
academic commented “if I’m using e-learning I can look at how much [the students] are 
using the materials, when they are using it, I can then tailor the things to meet their 
needs more carefully”. 

The academic staff we spoke to view the MLE as integrating these two aspects of 
their role much more effectively. While it was acknowledged that this may not be 
time-saving there was a general appreciation of how pedagogic direction could be 
influenced by possessing a greater handle on student data. Yet this was not shared by 
administrative staff indicating that there is a perception difference between these 
stakeholders. This in turn could jeopardise the project as it is seen as lower priority. 

Key lesson: address operational connectivity at an early stage by considering how 
the MLE will operate to introduce effective systems for all stakeholders. 
Organisational structure and change 
Although five of the interviewees had been involved at some stage in the decision to 
acquire an MLE there was no clear consensus about what the main drivers were behind 
that decision, other than a recognition that, as Collis and Moonen (2001, Ch. 2) express 
it “you can’t not do it”. Among the reasons given for this were the perception that “we 
were quite behind and that we’ve jumped” and the notion “that we cannot fall behind in 
this arena”. Additional reasons were an acknowledgement of the growing and 
changing nature of student demand, including the need to address the lifelong learning 
agenda but only in one case was there a clear reference to government initiatives and 
funding as a factor influencing the decision. Two staff referred to the need to avoid 
what one called “disjointed incrementalism” with the proliferation of local initiatives 
and saw the introduction of an MLE as an opportunity to exert more central control 
over such developments within the institution, thus preserving the “brand”. There was 
some acknowledgement that there might be cost efficiencies, but there is as yet no 
mechanism for assessing this. One person commented that the decision to purchase the 
MLE was “amazingly quick”; another that the decision seemed “forced through”. 

The diverse reasons given for adopting the MLE are reflected in the lack of a clear, 
shared understanding about what an MLE is and how it differs from a VLE. The 
perception of one interviewee that some senior managers “wouldn’t really know what 
[an MLE] was” was borne out by three others who admitted that they could not really 
define it, followed by an assertion along the lines that they would like more 
information. 

Key lesson: provide a clear definition of an MLE and explain the functionality of the 
system to those stakeholders who are in a position to influence its implementation. 


System process 

Due to the tight timescale of the project, it was vital that staff within the ELU made use 
of existing institutional resources and expertise and kept up good channels of 
communication. A technical working group was established which facilitated this and 
enabled the ELU to focus on the pedagogic aspects of implementation, staff 
development, and technical issues relating to the interface. This co-operative model of 
working was highly successful for the implementation and one which should form the 
mainstay of the move towards a fully integrated MLE as it enabled a more objective 
vision to be applied to the process. 

Despite this there is some concern among academic staff that certain groups in the 
university regard this as “a technology systems implementation [project] with no 
intrinsic interest, not even willingness to be interested, in what the academic pedagogic 
issues are”. There is a need to balance the obvious technological drivers with a clear 
sense of the pedagogic benefits for the introduction of a large scale e-learning project. 
Communicating the advantages of using e-learning can be problematic as it is 
automatically associated with the introduction of technology. In addition, further 
challenges were presented by the fact that not all the technological infrastructure of the 
university was ready for this scale of initiative. As one senior administrator observed, 
the university has a considerable number of IT projects in development and is 
considering “a way of evaluating all these projects [and] deciding on the priorities of 
them” but, it “is at a very very early stage in that process”. Where will the MLE fit in 
order of priority? And as one academic argued “the physical classrooms have [still] got 














to be very good if you are using a VLE because [. . .] you have to be able to show it to 
students routinely [so if] all these things are not right in the physical sense, you cannot 
use the VLE as a natural part of your face to face lecture”; in turn undermining the 
pedagogic value. 

Key lesson: institutional e-readiness is vital for the successful introduction of 
e-learning. 


Professional development 
The challenge at City is how to implant the MLE into the consciousness of staff, 
particularly academic staff. Among these interviewees there was a widespread 
perception that there is insufficient support for staff development in the use and 
applications of the MLE. As one expressed it: “part of the issue which hasn’t been 
addressed is actually the skill set of academics and support staff across the institution 
in order to take [the MLE] forward”. As far as academic staff are concerned the view 
was expressed by four of the respondents that such development should be delivered 
as part of the wider teaching and learning strategy, but whether this will be achieved 
through recruitment, resource provision in terms of time and/or money, the promotion 
of e-champions in the schools or some combination of these was another area where 
opinions diverged. With competing demands and pressures, lack of time to engage 
with staff development activities has been a key factor in the uptake, or lack of it, of staff 
development sessions. And there is a difference in opinion between staff on how to 
rectify this. For example, one academic significantly involved in e-learning stated 
clearly that she felt “a proper champion system so that we can reward people even if 
it’s on a very small level for being involved” is vital to the success of the 
implementation. But others disagree: another academic maintained that “if the reward 
is a small amount of money or a small fellowship or a scholarship for something, it 
won't make people do it”. They will only “do it because they’re interested”. And 
furthermore, an administrator stated “I would see it as part of somebody’s job to keep 
up with IT development and if you reward people for it, it might give the wrong 
messages”. So how do we integrate professional development with e-learning into an 
environment where academic staff may be sceptical of the benefits of such 
engagement? This is a question which has not been resolved. However, what has been 
a considerable success is offering staff development to all members of staff, regardless 
of status, and in a flexible mode of delivery. 

Key lesson: be responsive and listen to the needs of your staff, while maintaining a 
core level of competency to ensure standards are maintained. 


Strategic vision and perception 

Where to take the e-learning initiative now has been the subject of very different 
responses. While there was general agreement among the interviewees that there was 
no clear strategy for the implementation of the MLE, there were divergent views as to 
whether this was problematic. On the one hand the view was expressed that 
“ownership has to be from all parties involved [. . .] learning information resources are 
crucial, academics are absolutely crucial” while recognizing that “administrative staff 
are probably on the fringe”. In contrast is the view stated by four of the interviewees 
that the roll out of the MLE should be driven primarily by the teaching and learning 
strategy with input from other areas as required. There was also disagreement as to 
whether there should be an overarching e-strategy, one view being that “what you need 
are teaching and learning strategies, research strategies and so on, and they have 
e-dimensions”. Set against this is the view that “the e-strategy [should be . . .] slightly 
wider than the teaching and learning component”. To some extent these differences 
can be mapped to the differences of understanding about what an MLE is. 

While most people interviewed regarded e-learning as significant in terms of 
general technological progression by the university, there were different levels of 
agreement as to where this would end. Some staff expressed the opinion that you could 
not afford not to embrace the opportunities inherent with MLE implementation: “what 
people don’t realise is that this is not a choice that we have, it’s not a question of 
deciding whether to do this or not, it’s that we have to do it at some level”. Others were 
more cautious, “I think everyone will use it but I think some will be faster than others 
in taking it on board” and regarding uptake as developing no further than using the 
technology for a document repository. 

There was a definite sense of agreement from both academic and administrative 
staff that the university needed to consider e-learning as a business venture. One 
academic maintained that while “you can separate to some extent the teaching and 
learning strategy [. . .] you also need a higher level business strategy that actually then 
keeps these three things [...] until we can have a strategic approach, e-learning is 
rudderless”. This was echoed by administrative staff who saw the e-learning process 
as part of a new strategic direction for the university. It was agreed that the university 
was at an early stage in the change cycle, but the time was right for a greater strategic 
direction to be communicated. 

Key lesson: clear strategic directions for implementing e-learning and integrating 
systems are required, not just in terms of an e-strategy, but making e-learning integral 
to all strategies. 


Conclusion 
These six key lessons relating to the implementation of e-learning and the MLE have 
been brought to light through the research interviews with key decision-makers at 
City. Examining the strategic vision of influential figures in senior management in 
relation to City’s overall commitment to improving its online business systems and 
particularly how e-learning connects with this vision has provided valuable insight 
into the process of institutional change within the university. One of the most 
significant features of all these interviews is the recognition that institutional change is 
a complex evolutionary process. While the establishment of the ELU and rollout of the 
MLE system could be regarded as instances of a revolutionary “big bang” for the
university, most of those engaged with the process view this as the beginning of a
longer transitional period. Although City has undoubtedly undergone some major 
change in e-learning provision since the summer of 2003, the ELU and other staff 
involved with creating the MLE are still working within existing parameters 
concerning organisational structures, funding mechanisms and perceptions. This 
therefore limits the revolutionary impact of the “big bang” of summer 2003 and sees 
the advent of a “bedding down” and quieter integration period. 

There will be a number of challenges ahead, but the main one for staff involved with 
e-learning at City will be to keep the momentum going by maintaining that enthusiasm 
of early adopters while convincing and engaging more sceptical staff. And through all
this the university will have to communicate a clear sense of strategic direction and
commitment. Yet one of the most positive aspects of this process has been the
establishment of effective communications between hitherto disparate elements in the 
organisation and a greater shared sense of ownership of the process of change in the 
organisation. The over-arching message that came out of all the interviews we carried 
out was that the e-learning process and this research have made senior decision makers
reflect on their role within the institution and the role of the MLE in this cycle of 
change. And this is perhaps one of the most truly revolutionary, unanticipated 
outcomes of the e-learning initiative. 


Notes 


1. When drawing the distinction between “virtual learning environment” (VLE) and “managed 
learning environment” (MLE) we are referring to the definitions given in JISC (2002a),
which defines a VLE as referring “to the ‘online’ interactions of various kinds which take 
place between learners and tutors” and an MLE as “the whole range of information systems 
and processes of a college (including its VLE if it has one) that contribute directly, or 
indirectly, to learning and the management of that learning”. 

2. These initiatives included the redesign of the University’s degree programme offering into a 
standard credit-rated module framework with clear learning outcomes and objectives; the 
rolling out of a content management system to ensure web uniformity and enhanced 
resource management; the upgrade of the student record system to include centralization of 
the assessment, award and progression process; new assessment regulations; and finally the 
vision of integrating all university systems into a student portal. 
Abstract

Purpose — To provide a review of the interface between e-learning, digital libraries and learning 
content. 

Design/methodology/approach — A review of current thinking and activity surrounding the 
delivery of content in e-learning systems. Some analysis of information concerns and commentary on 
future scenarios. 

Findings — The paper investigates the reality of information management in e-learning practice. It 
looks at types of information extant in systems and analyses links between (virtual) learning 
environments, digital libraries and web content. It examines the potential for reuse of material in a 
university context and the supporting standards and technology. 

Research limitations/implications — Looks particularly at UK and US context but also has an 
international dimension. 

Originality/value — It brings together a disparate set of issues to provide a discursive but practical
summary of the topic. It will be of value to an information manager faced with managing content in a
learning organisation. 

Keywords Computer based learning, Learning, Information management, Open systems 

Paper type Conceptual paper 


E-learning now 

E-learning is an ill-defined concept, subject to wide variation in practice, but which 
nevertheless has become an established component of education delivery worldwide. 
At one extreme it implies the use of web technology to facilitate the whole cycle of
learning from initial sign-on to final certification, with a range of operations in 
between, and with no, or little, physical interaction with the host university. This 
replicates the distance learning model and has parallels with the operations of distance 
learning universities, which sprang up in the 1960s, though they too had earlier roots in 
the external degrees of the major UK universities. At the other extreme, and much more 
commonly, e-learning in many university and college contexts is a hybrid of 
“traditional” face-to-face teaching, with electronic delivery of content and services built 
on and, where appropriate, with administration and related tasks also being web based 
— so called blended learning, in a mixture of the old and new. 

It has also been correctly described as a process and not as a technology or a 
product. But to enable these interactions, generic systems have been developed, known as
virtual learning environments (VLEs) in the UK and learning management systems (LMSs) in
the USA, which provide a technological, parameter-driven framework to allow individual
academics to develop and deliver learning content, to interact with students and to
facilitate open discussion. They will also generally support a range of administrative
functions relating to the course. In the UK, at least, the VLE concept has been further 
enlarged to encompass other institutional functions such as student house-keeping, 
bursary, timetables and so on, leading to the concept of a managed learning 
environment (MLE). The MLE remains a novel and somewhat elusive concept, while 
VLEs have become established as full-blown commercial products with industry level 
support, regular software upgrades and product enhancement. They are a staple 
function in many universities. 

However, the pedagogic aspects of e-learning are perhaps less well understood and 
appreciated than the IT which underpins it. Again, at the extreme, implementations of 
VLEs have sought to transpose the methods of traditional learning and teaching into 
the web domain so that existing learning materials (content) or citations to printed 
materials are delivered through the web equivalent of course notes, hand-outs, and the 
like. Students are directed to assimilate that material and undertake exercises based on 
it. The VLE can also be used to interact with students to provide a level of online 
support. 

This model is perhaps the most common, endemic in universities at the moment, but
it is arguably a poor use of the potential of e-learning (Stiles, 2000). At the other
end of the spectrum is what is referred to as “content-free learning”, which essentially 
implies a communal approach to learning whereby students are facilitated to interact, 
investigate and improve mutual understanding. This more radical approach has some 
parallels with the development of knowledge communities supporting scientific 
communication and does not pre-suppose any given text, albeit the interactions 
themselves could, in due course, result in a knowledge base capable of being stored, 
searched and exploited. 

Indeed, to understand the role of content in the VLE we really need to understand 
the pedagogic processes which apply: 


There are a range of learning theories and learning processes in contemporary education 
informed by a variety of theorists and encompassing a variety of different forms and 
methods. Contemporary learning theories provide guidance (to e-learning development) 
which can extend beyond the surface learning which appears to be characteristic of the 
transmissive modes of teaching that are associated with conventional courses (Oliver, 2004). 


Content is most prominent in “behaviourist” learning, which is characterised by 
knowledge transmission and acquisition, equates with traditional lectures and is also
perhaps the easiest to create within the VLE. Other learning processes are less content 
dependent, perhaps implying more interactivity and engagement: learning by doing. 
We can differentiate between learning which is essentially about the acquisition of 
knowledge, and learning which is about making sense of things and interpreting and 
understanding reality in a different way. It is the difference between knowing “that” 
and knowing “how” (Ryle, 1949). Ultimately it is the idea of social learning which 
hinges on social interaction, so called learning communities or communities of practice. 
Smith (1999) also quotes Wenger, the apostle of situated learning, whereby learning is 
not seen as the acquisition of knowledge by individuals so much as a process of social 
participation: “The nature of the situation impacts significantly on the process.” Here
content might play no part at all, albeit the consequence of the activity might, in itself, 
be the creation of knowledge — though whether of any contemporary value will be 
addressed later. 





A further analysis is provided by Maccoll (2001), who quotes Mason (1998) in
identifying three different approaches to e-learning design:


(1) Content and support — which, in essence, is the traditional model of delivery 
whereby content is static and central to learning and backed up by conventional 
or off-screen support. 


(2) Wrap around — which implies higher levels of interaction with the content itself
and which, in turn, may become more dynamic. It equates with the cognitive 
learning theory — the “knowing how”. 
(3) Integrated — which employs a “community of learning” approach whereby 
assignments become collaborative and support is mutual, leading to the 
possibility of changed roles (students as teachers/teachers as students) and the 
creation of new knowledge. 


E-learning content 
So, in order to address the role of information management in this new, emerging 
educational landscape we need to examine the nature of the learning content implicit in 
the above models. Content itself could be said to form a spectrum including at one end 
highly structured pre-existing traditional published material, to a loose association of 
ideas within a loosely structured knowledge base at the other (see Figure 1). 

Content could be categorised as: 


* published, structured and quality material such as library content and similar 
works within an established quality framework; 
* less structured material such as course notes, handouts and the like which may
vary lecture to lecture, is poorly structured and not subject to any bibliographic
controls; and
* totally unstructured material which might emerge from discussion fora or e-mail
lists and which is constantly changing and being amended — we need to imagine 
interactive e-mail debates or even web logs. 


Of the e-learning delivery models outlined above, the problems deriving from the use of 
traditional materials within VLEs (and “traditional” here is used also to embrace any 
source of verifiable quality — electronic or not) are now well rehearsed if not completely 
resolved. The chief problem can be simply expressed as: a student is working in the 
VLE and is recommended to read a given article which is within a licensed database; 
how can this be enabled with minimum effort and minimum confusion on the part of 
the user? Underlying this question are two further points: 


(1) The problem that some such library content repositories are dynamic, with ever 
changing identifiers (URLs). 

(2) Remote repositories may well have different security access systems from the 
main university system itself, i.e. different levels and types of authentication. 


The simplest solution to this problem is to copy across content into the VLE domain 
itself, so that there is always a residual accessible version of the published article for 
students to use. The copyright restrictions are obvious and hence, as a solution, it is 
unlikely to be universally applicable, even if fairly common. 

Much research has been given over to technical solutions to seamless access to 
content. Projects such as Angel (2004), Devil (2004) and Olive (2004) have all looked to 
establish methodologies for persistent resource links or techniques for deep-linking 
which will resolve the VLE/Library conundrum. All are viable to some extent and, 
depending on the target content, are likely to use some variation of the OpenURL
standard or other federated/distributed search enquiry to broker the different content
repositories. These problems are comprehensively detailed in a white paper published
by the IMS/CNI (McLean and Lynch, 2003), which notes both the low level of
interconnection between resources on the net and that, even where there is interconnection,
the sequence can be clunky and prone to failure. In fact, many or most of the resource
repositories are “autonomously managed — they have been developed independently 
with particular service and business goals”. This will continue until such repositories 
have service levels that will allow resources to interoperate through the local article 
resolvers, and there are solutions to the interoperability of metadata standards in 
distributing the query. Such solutions may not yet be perfect, but they are viable ways 
forward. 
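
To make the idea of brokered linking concrete, the following is a minimal sketch, not a description of any of the projects cited above. It assumes a hypothetical institutional resolver address and an invented citation, and uses query keys in the style of OpenURL 0.1. The point it illustrates is that the VLE stores citation metadata and leaves the choice of the current target copy to the resolver, so that a changing vendor URL does not break the reading link.

```python
from urllib.parse import urlencode

# Hypothetical resolver address; an institution would substitute its own.
RESOLVER_BASE = "https://resolver.example.ac.uk/openurl"


def build_openurl(citation: dict, base: str = RESOLVER_BASE) -> str:
    """Build a resolver link from citation metadata rather than a vendor URL."""
    # Drop empty fields so the query carries only what is actually known.
    known = {key: value for key, value in citation.items() if value}
    return f"{base}?{urlencode(known)}"


# Invented citation for illustration, using OpenURL 0.1-style keys.
article = {
    "genre": "article",
    "issn": "1234-5678",
    "date": "2004",
    "volume": "10",
    "issue": "2",
    "spage": "45",
    "aulast": "Smith",
    "atitle": "An example article",
}

link = build_openurl(article)
print(link)
```

A tutor would paste the resulting link into the VLE; authentication is a separate matter and is not addressed by this sketch.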

At the technical levels, standards have emerged for the interoperability of e-learning 
resources and the VLE developers themselves have created systems that will 
interoperate with the various resource repositories or library systems. Project Easel, for 
example, examined opportunities for cross-European searching of e-resource banks in 
order to formulate new e-learning courses. 

Similarly, the authentication/authorisation issue has been subject to research and
development, with the most prominent current development being Shibboleth (2004),
which, given its very large-scale backing by global companies, seems likely to
succeed.








Locally authored learning content 

In practice much of the material populating VLEs is as likely to emanate from
individual academics or related developmental groups as it is from commercial or 
licensed resources — indeed, most of the literature and anecdotal evidence suggests 
that this is the most common method for delivery of content at present. The material, 
being locally owned, can be delivered within the VLE without more ado. The only 
information management issue that then arises is whether to retain and archive that 
material, and for what purpose. Very little appears to be written on the topic (Lynch, 
2002, seems the only one to have addressed it). 

There are basically two reasons why you might wish to retain locally produced 
content. The first is the potential to reuse or re-purpose material, whether within the 
institution itself or across similar institutions or consortia. The second is to provide an 
institutional archive. Dealing with the second issue first, as it is not strictly central to 
this paper, institutional repositories have been posited as a requirement for universities 
in order to preserve the intellectual record of the institution. The purpose of such a 
database could be multiple, but would inevitably encompass the normal archiving
function for any large organisation. It would also increasingly encompass materials
which otherwise might have been disseminated through other channels, or discarded.
These might range from course notes, e-mail logs and pre-prints of scholarly articles
to the normal run of committee papers and the like. In the context of e-learning
it makes sense to preserve the substance of an electronically delivered course in that, as 
Lynch points out (Lynch, 2002), in due course this may be an integral part of any legal 
or similar challenge which students may make to a university ruling. 

The problem at the moment with the notion of institutional repositories is that they 
potentially perform so many roles that it is difficult to see whether we are indeed 
talking about one single repository or potentially a number of repositories with 
different but overlapping and interlinking functions. For example, there could well be 
an administrative repository, a scholarly output or research repository, and a learning 
resource repository. The differences here are not so much the technological 
infrastructure as the way in which the material is held and described, given the 
different purposes to which it might be put. Perhaps the most quoted example of an 
institutional resource repository is the D-Space initiative of MIT, which has sought to 
disseminate the support material for its curricula offering, not only within the 
university itself but globally. 

One concrete example of document repositories might be course or module 
information, such as course specifications and learning outcomes and assessment 
criteria. These are likely to sit easily in a structured database accessible by course 
identifiers and link with relevant student records and administrative data. They have 
more in common with document systems than either digital libraries or e-learning
content, but nevertheless contain material which is frequently sought. 

A second rationale for archiving and storing learning resource materials is, in some 
ways, easy to argue — it is the simple notion that such material can be re-used or 
adapted for other institutional purposes by other course developers. It plays on the idea 
of learning objects, whereby learning content is broken down into discrete amounts of
learning or material which can be brought together to deliver different learning 
outcomes. The re-use concept is an underpinning philosophy of much of the e-learning 
debate; it has also generated research programmes, standards work (SCORM (2004) 
derives directly from this approach), and the potential for an open market for such
material. 

The advocates of re-usability argue factors such as cost efficiency, consistent 
quality, rapid development and improved learning quality. The idea is that, in time, 
there will be an array of quality learning resource objects within distributed
repositories which can be searched, retrieved and re-purposed into a new course. The 
process raises issues of object inter-operability, granularity, distributed resource 
discovery and intellectual property. It has largely been concerned with the 
development of new resources rather than the discovery and reuse of existing 
resources (Oliver, 2004). It has to be said that there is no long history of the reuse of 
learning materials, at least within UK universities where academics can be very 
territorial about their curricula and support material, and the extent to which this is 
happening throughout universities, or potentially might happen, is perhaps a moot 
point. A recent report from the UK HEFCE (Glenaffric Ltd, 2004), which, in turn, was
reporting on a sector-wide consultation on e-learning, noted that it may be that the 
approach is more applicable in a training context. 

There are other potential barriers to re-use in the university context:


* To be effective, learning resource material needs high quality metadata, which is 
not only descriptive, but is indicative of pedagogic outcomes. The creation of this 
metadata is no simple task and it is unclear whether the expectation is that this 
will be done centrally, e.g. by librarians, or by individual academics themselves. 
If the latter, there is certainly a significant training requirement here and it is 
unlikely to be welcomed by hard-pressed teachers. Oliver et al. (2003, p. 38)
report on one project where the metadata was created by contributors: “Even
though the inclusions of the metadata was a contractual requirement for the 
developers, there were a number of discrepancies observed in the scope and 
extent of the metadata for the resources ...” 


* In searching for new material there is no evidence that academics would
normally turn to such repositories. For example, the use of services such as the
RDN remains minimal and the evaluative review of the DiVLE project (Brophy
et al., 2003, p. 24) reported that “many academics use Google as their primary
source of information”.
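
The metadata barrier described above can be illustrated with a small sketch. It assumes a hypothetical record whose field names are invented for illustration and are not drawn from any metadata standard. It shows the kind of completeness check a repository might run when contributors, rather than librarians, supply the metadata, with the pedagogic fields treated as seriously as the descriptive ones.

```python
from dataclasses import dataclass, field, fields


@dataclass
class LearningObjectRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    title: str = ""
    description: str = ""
    author: str = ""
    rights: str = ""
    # Pedagogic fields: what the object is for, not just what it is.
    learning_outcome: str = ""
    level: str = ""
    keywords: list = field(default_factory=list)


def missing_fields(record: LearningObjectRecord) -> list:
    """Return the names of the fields a contributor left empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Invented example of a partially described learning object.
record = LearningObjectRecord(
    title="Introduction to citation searching",
    author="A. Tutor",
    keywords=["information skills"],
)
print(missing_fields(record))
```

A check of this kind only detects absent values; it says nothing about whether the metadata supplied is accurate or consistent, which is the harder problem reported in the projects cited.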


This situation is unlikely to change quickly, though there are relevant developments 
which might presage change. For example, the availability of free quality content is 
becoming common so that academic effort might shift more towards learner support 
than the mechanistic concern about content production and delivery. The already 
mentioned decision by MIT to mount all course materials with free access is indicative 
of the direction, as is the Merlot (2004) service. An alternative scenario has been the
growth of both commercial and national learning resource repositories that can feed 
the development of local interactive resources. Examples of these might be the NLN 
(2004) in the UK (this is aimed at college level more so than university level and 
provides small episodes of learning to maximise flexibility in delivery within 
e-learning programmes), the offerings of Pearson Education, material brokered by VLE 
vendors such as WebCT or Blackboard, and the outcomes of the JORUM (2004) project. 
However, the uptake and commercial viability of these services and repositories are
unknown and, though much investment has gone into the development of some of the
more commercial enterprises, they do depend on the development of a more
interactive mode of e-learning delivery than the didactic approaches initially described.
And, as Stiles (2000) points out, there are clear contradictions here, with many seeing 
content as a future market with others proposing opposite models. An unresolved 
question in this is: “Is it an institution’s content or the educational experience it 
provides, which affects competitive advantage?” 

So, should institutions be building learning resource repositories to facilitate 
institutional resource sharing? The standards are there for this to happen and there is 
much experience to go by, but the organisational and cultural issues are perhaps less 
well rehearsed; for example, who should own such repositories? Should they be 
centrally managed with consequent efficiencies or distributed to faculties? It certainly 
requires an unambiguous institutional strategy in order to progress, as many 
academics would regard lodging their materials with an institutional server as 
peripheral, if not downright unacceptable. But if universities are to capitalise on their 
knowledge resources then the development of learning repositories is a clear necessity. 
It becomes one of securing faculty ownership, ensuring common levels of 
interoperability, and putting in place relevant reward mechanisms to get the whole 
effort moving. It contrasts with current practices identified by McAndrew et al. (2003) 
where much content is being stored within the VLE itself or in local repositories and 
migrating this to a central repository is in itself a significant challenge. The alternative
is the use of internal harvest mechanisms or agent technologies to create virtual 
collections, which might overcome some of the resistance to centralised repositories. 

In any event, repositories can be structured along faculty lines and, as long as 
standards of metadata are in place, then inter-discipline sharing becomes possible. 
Building repositories is perhaps, in itself, not overly complex. But to have maximum 
value they will require the attention of a mixture of experts including librarians, 
information managers, learning technologists, IT staff and teachers, and the policies 
that will need to be addressed will include: 


* Rights management — unless individual academics can see some kind of return 
for their efforts, they will be unwilling to deposit content, and may prefer to go to 
a commercial group. 


* Archiving policies — Lynch (2002) makes the point that, even though a course 
may have ceased to run, there may well still be a need for an archive so as to 
comply with appeals systems, etc. 


* Promotion and exploitation — in that the individuals may well perceive 
alternative resource banks as being more appropriate, easier to adapt, and so on, 
while the local repository may well be found to be facing increasing competition 
both with commercial and consortial efforts. 


* Provision of good metadata — this also assumes that the purpose of metadata in 
this context is specifically aimed at the reuse within learning resource 
repositories, while in many ways metadata also has to serve the traditional 
function of metadata in indicating the availability and duplicability of the item 
itself — the question is, “which audience is the metadata being created to serve?” 

* Finally, although content may be potentially reusable, divergence in pedagogic 
practices can imply the need for a significant refocusing to ensure the transfer of 
outputs from one university to another. 
Systems will also need to be easily navigable for course creators and provide quality
outputs, including alerting systems that accurately profile user needs. 

So, the question remains as to whether we will see a number of small institutional 
learning resource repositories which may not, in effect, pay their way or whether there 
is a move to more consortial approaches or whether, in the end, the commercial sector 
will win through. 


Resource lists 
Bridging learning modes (1) and (2) described earlier are reading lists or, perhaps more 
correctly, resource lists, which, inter alia, provide direction to the student as to what to 
assimilate, but also advise libraries and bookshops and others of student needs. These 
also translate into the VLE and the digital library domain and enable the tutor to 
provide direction to core resources, whether web-based or print, to circulate these 
resources in whatever way is needed, and to direct central and support units as to what 
is happening. Reading list management systems (RLMs) are shared and distributed 
databases that can provide a direct link between library catalogues and the VLE, and 
vice versa. For preference they should be integrated with both, in effect a subset of the 
catalogue of resources identified as appropriate to a specific course. They, too,
need ownership, both by tutors and by libraries to ensure currency and relevance. 
Such resource lists, in themselves, might create a dynamic metadata repository key 
to a university’s course offering. Current reading lists are notoriously static documents, 
often dated and rarely fit for purpose. The opportunity to create a dynamic, shared and 
annotated list, as a bridge between resources and learning is one which, in the end, may 
prove to be more critical than many other developments we have noted previously. 
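
As a rough sketch of what a dynamic, shared and annotated list might mean in practice, the fragment below assumes a hypothetical catalogue and record structure (all names and values are invented). Each list entry points at a catalogue record instead of copying its details, so the list stays current as the catalogue changes, and entries whose record has been withdrawn are flagged rather than silently left stale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogueRecord:
    """Stand-in for a library catalogue record (hypothetical fields)."""
    record_id: str
    title: str
    year: int


@dataclass
class ListEntry:
    """One line of a course resource list."""
    record_id: str  # points at the catalogue instead of copying its details
    note: str = ""  # tutor annotation for students


def render_list(entries, catalogue):
    """Resolve each entry against the catalogue and flag withdrawn records."""
    lines = []
    for entry in entries:
        record = catalogue.get(entry.record_id)
        if record is None:
            lines.append(f"[no longer in catalogue: {entry.record_id}]")
        else:
            lines.append(f"{record.title} ({record.year}). {entry.note}".strip())
    return lines


# Invented catalogue and course list for illustration.
catalogue = {
    "b001": CatalogueRecord("b001", "An example textbook", 2003),
}
course_list = [
    ListEntry("b001", "Read chapters 1 and 2 first."),
    ListEntry("b999", "Background reading."),
]
for line in render_list(course_list, catalogue):
    print(line)
```

The same rendered list could be shown in the VLE and passed to the library, which is the bridging role the resource list is asked to play.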


Communities of practice 

The final model in the Maccoll-Mason analysis is the more radical e-learning scenario
where new concepts are created through the interactions of the virtual learning 
community. This does not tend to lend itself to any obvious information management 
analysis other than a comparison with knowledge management and similar systems. In 
such a dynamic scenario it is likely that any repository will be created in real time, in 
effect a live archive of course history. 

Lynch (2002) talks about student information and published information being 
co-mingled as the outputs of this kind of learning. It seems more likely that this 
approach to learning will be either complementary to, or supplemented by, some of the 
other models earmarked so that there will be an intermixing of dynamically created 
commentary together with the texture of material itself. It is also likely that disciplines 
themselves will vary in the extent that they take up any of these different options, in 
that e-learning does not necessarily suit every specific discipline.


Summary 

In summary, much of the debate over this topic has been at a technical level and is 
focussed on the specific issues of ensuring interconnections between resources 
identified in the learning domain and those held in learning resource repositories and 
elsewhere. It should not be forgotten that much learning is unstructured and open, and 
there is no doubt students within HEIs will continue to need library portals to enable 
access to the totality of information resources that they require and that universities
can supply. In the same way, tutors are unlikely to be bound by the constraints of
institutional or commercial learning resource repositories, and will continue to search
comprehensive library collections in an ad hoc way. Thus content stretches from the
structured to the dynamic and free form, while learning can be precise, directed and
dependent on the one hand and open and content free on the other. Information
management systems will be biased towards clear learning outcomes, heavily
controlled and structured, while learning will always be a more complex and personal
act and, however good the system, will be difficult to emulate.

We have seen that most of the research and development to date in the interaction 
between information and e-learning has been concerned with the citation of electronic 
material held within licensed databases and the utilisation of existing library 
collections. Paradoxically, most of the actual implementation work to date is probably 
more at the level of individual academics creating content for individual web sites. No 
doubt, in due course, this gap between research and practice will gradually be bridged 
as linking technologies become simpler and more commonplace, and there is a new and 
emerging generation of teachers who are more web adept and more likely to be using 
born-digital content. Whether this happens within the confines of an e-learning
structure, such as a VLE, or whether, as is probably often the case, it is a matter of 
individual teachers pointing their group of students to their content is possibly a moot 
point. It probably depends on the nature of the institution in question and the extent to 
which it favours such corporate approaches or not. It may also be an issue to do with 
level and discipline mix. Certainly, in the recent past, much of the developmental effort 
in the UK has gone on at the sub-graduate level where management may be much more 
centralist than that at prestigious, and therefore somewhat anarchic, universities. 

Perhaps the future reality will be something very different from what we now know 
and understand. It ought to be about choosing the relevant learning process rather than 
being constrained by any given system. To repeat the point, we still know very little
about the pedagogy of e-learning, and it is perhaps no coincidence that this has
emerged as a new action line in the UK’s national information initiatives. In the
commercial sector, e-learning is certainly being looked at as an aspect of much wider
strategic developments which also include corporate communications, knowledge
transfer and other foundation stones of a learning organisation so that delivery moves 
on from the relatively “flat” learning material now extant in library databases or VLEs 
to the much wider use of sound and images to support understanding and 
communication. 


References 

Angel (2004), web page, available at: www.angel.ac.uk (accessed 11 January 2005). 

Brophy, P., Markland, M. and Jones, C. (2003), Link”: Linking Digital Libraries and Virtual 
Learning Environments: Evaluation and Review, Final Report: Formative Evaluation of the 
DiVLE Programme. Deliverable D5, Link?" Project, CERLIM Centre for Research in 
Library & Information Management, Manchester, available at: www.cerlim.ac.uk/projects/ 
linker/linker5_master.doc (accessed 1 January 2005). 

Devil (2004), web page available at: http://srvl.mvm.ed.ac.uk/devilweb/index.asp (accessed 11 
January 2005). 


Information 
management 


165 





166 


Glenaffic Ltd (2004), Responses to Consultation on the HEFCE e-Learning Strategy: A Report to 
: HEFCE by Glenaffric Ltd, May, available at: www.hefce.ac.uk/pubs/rdreports/ 
2004/rd04_04/ (accessed 11 January 2005). 


JORUM (2004), web page available at: www jorum.ac.uk (accessed on 11 January 2005). 


Lynch, C. (2002), “The afterlives of courses on the network: information management issues for 
learning management systems”, ECAR Research Bulletin, Vol. 2002 No. 23. 


McAndrew, P., Simpson, E., Scantlebury, N., Thorpe, K., Maccoll, J., Alexander, W., Ellaway, R., 

_ Low, B., Dozier, M. and Stevenson, E. (2003), Project Devil: Dynamically Enhancing VLE 

' Evaluation Information from the Library, Devil Evaluation: Report on the Ways of Working 

_ Needed to Provide Effective Dynamic Linking between VLEs and Digital Repositories, 

August, available at: http://srvl.mvm.ed.ac.uk/ devilweb/waysofwork.pdf (accessed 11 
January 2005). , 


Maccoll, J. (2001), “Virtuous learning environments: the library and the VLE”, Program: 
' Electronic Library & Information Systems, Vol. 35 No. 3, pp. 227-39. 


McLean, N. and Lynch, C. (2008), Interoperability between Information and Learning 

` Environments — Bridging the Gaps, a joint White Paper on behalf of the IMS Global 

- Learning Consortium and the Coalition for Networked Information, DRAFT version, 

28 June, available at: www.imsglobal.org/DLims_white_paper_publicdraft_1.pdf 
(accessed 11 January 2005). 


Mason, R. (1998), “Models of online courses”, Asynchronous Learning Networks Magazine, Vol. 2 
No. 2, available at: www.aln.org/alnweb/magazine/vol2_issue2/Mason.nal.htm (accessed 
11 January 2005). 

Merlot (2004), web page available at: www.merlot.org/Home.po (accessed 11 January 2005). 

NLN (2004), web page available at: www.nin.ac.uk (accessed 11 January 2005). 


Olive (2004), web page available at: www.jisc.ac.uk/ index.cfm?name= project_olive (accessed 11 
_ January 2005). 
Oliver, R. (2004), Factors Impeding Institutional Design and the Choice of Learning Designs in 
Online Courses, available at: http://elrond.scam.ecu.edu.au/oliver/2003/workshop_paper. 
pdf (accessed 11 January). 


Oliver, R., Wirski, R., Hingston, P. and Omari, A. (2003), “Exploring the reusability of web-based 
' learning resources”, in Lassner, D. and McNaught, C. (Eds), Proceedings of Ed-Media, 
' Honolulu, HI, pp. 115-22. 


Ryle, G. (1949/2000), The Concept of Mind, Chicago Press, Chicago, IL. 
SCORM (2004), web page available at: www.adInet.org/ index.cfm?fuseaction = scormabt 


Shibboleth (2004), web page available at: http://shibboleth.internet2.edu/ (accessed 11 January 
2005). 


Smith, M.K. (1999), The Encyclopedia of Informal Education (Learning Theory), available at: 
' www.infed.org/biblio/b-learn.htm 


Stiles, MJ. (2000), “Effective learning and the virtual learning environment”, in EUNIS 2000: 
Towards Virtual Universities: Proceedings of the European University Information System 
: 2000 Conference, INFOSYSTEM 2000, Instytut Informatyki Politechniki Poznanskiej, 
: Poznan, pp. 171-80, available at: www.staffs.ac.uk/COSE/cose10/posnan.html (accessed 11 
_ January 2005). 








Further reading 


Davenport, E. (2001), “Knowledge management issues for online organisations: communities of 
practice as an exploratory framework”, Journal of Documentation, Vol. 57 No. 1, pp. 61-129. 

McLean, N. (2004), Interoperability Convergence of Online Learning and Information 
Environments, available at: www.colis.mq.edu.au/news_archives/convergence.pdf 
(accessed 11 January 2005). 

Markland, M. and Brophy, P. (2003), Link®®: Linking Digital Libraries and Virtual Learning 
Environments: Evaluation and Review. Deliverable D1, Link®® Project, CERLIM Centre for 
Research in Library & Information Management, Manchester, available at: www.cerlim. 
ac.uk/projects/linker/d1_master.doc (accessed 11 January 2005). 

Roes, H. (2001), “Digital libraries and education: trends and opportunities”, D-Lib Magazine, Vol.7 
No. 7/8. 


Information 
management 


167 





The Emerald Research Register for this journal is available at www.emeraldinsight.com/researchregister
The current issue and full text archive of this journal is available at www.emeraldinsight.com/0001-253X.htm





The use of the internet in higher education

Academics’ experiences of using ICTs for teaching and learning

Rebecca Eynon
Oxford Internet Institute, University of Oxford, Oxford, UK

Received 21 October 2004
Revised 6 December 2004
Accepted 13 December 2004


Abstract 

Purpose — To explore academics’ experiences of using information and communication technologies 
(ICTs) for teaching and learning. 

Design/methodology/approach - Analysis of three discipline-specific focus group discussions 
held with academics based in Higher Education Institutions (HEIs) that use ICTs for teaching their 
students. 

Findings — The most common use of ICTs in all subjects was to provide students with access to a 
range of online resources. Academics’ motivations for using ICTs included: enhancing the educational 
experience for their students; to compensate for some of the changes occurring in higher education, 
such as the rise in student numbers and demand for flexible learning opportunities; and personal
interest and enjoyment. The difficulties academics encountered when using these technologies for 
teaching included: a lack of time; dissatisfaction with the software available; and copyright. 
Research limitations/implications — This is a small scale, exploratory study. Further research is 
required that is sampled in such a way as to ensure that the findings can be generalized to all 
academics in all institutions in the UK. 

Practical implications — The institutional, middle managerial, staff and student level all need to be 
considered when encouraging the further adoption of new technologies for teaching and learning in 
higher education. Institutional level strategies must also account for the diversity of ways ICTs may be 
used in teaching in different contexts across the institution. 

Originality/value — Research exploring academics’ experiences of using ICTs for teaching and 
learning is scarce. Further work is required to ensure the successful development and implementation 
of future technological and policy developments in this area. 


Keywords Academic staff, Higher education, Communication technologies, Internet, Teaching, 
Learning 


Paper type Case study 


Introduction 

There has been a great deal of debate regarding the use of information and 
communication technologies (ICTs) for teaching and learning within universities. The 
potential of ICTs for higher education is well documented and has been much 
promoted by policy makers and enthusiasts within the sector. The Dearing Report 
(NCHIE, 1997), The Future of Higher Education (DfES, 2003), and the more recent
e-learning strategy proposals developed by the Higher Education Funding Council for
England (HEFCE, 2003) are all examples of policy commitment in this area, and this
[...] decade, research is required that explores academics’ experiences of using ICTs for
teaching and learning in order to: provide a clearer vision of where it is appropriate to 
use new technologies in higher education; to develop strategies to support existing 
initiatives and encourage further adoption (where appropriate); and ensure the 
successful development and implementation of future technological and policy 
developments in this area. 

Surprisingly little is known about lecturers’ opinions on, and experiences of, using 
educational technology (Steel and Hudson, 2001), though the research base is steadily
increasing. From analysis of the available research on this topic it is clear that there is
a range of individual, practical and cultural factors that shape academics’ use (and
non-use) of new technologies for teaching and learning (e.g. Selwyn, 2003). Two factors that
are likely to influence academics’ use of ICTs in teaching and learning are the
institutional (Clegg et al., 2003) and disciplinary (Rowley et al., 2002) contexts. Thus,
this exploratory study set out to investigate the potential similarities and differences in
academics’ use of new technologies for teaching and learning both within and across
three disciplines, namely English, Law and Nursing/midwifery. Academics were
invited to one of three discipline-specific events to share and discuss their own
experiences of using ICTs for teaching their students. This paper will focus on four key 
themes that emerged from each of these discussions and consider the differences and 
similarities within and across each of these groups. The four themes to be explored are: 


(1) How ICTs are being used in teaching and learning.
(2) The motivations of academics to adopt ICTs in teaching and learning.
(3) The difficulties they have encountered when using these technologies for their
students.
(4) The factors that may influence the further adoption of new technologies in
higher education.


These will be discussed in detail in the results, discussion and conclusion sections 
below. First, the methods utilised for the study are summarised. 


Method 
In June 2004, three discipline-based focus groups were held with academics from English, Law
or Nursing/midwifery who used ICTs to teach their students. Each discussion group
was part of a one day, discipline specific, event where staff from HEIs across the UK 
were invited to a workshop where they were presented with the findings from a 
research project that explored the use of the web for teaching and learning in higher 
education (Eynon, 2005); and were then asked to participate in a focus group to discuss 
their own experiences of using ICTs in teaching and learning. The events were 
designed to provide a greater insight into the use of ICTs for teaching and learning in 
higher education, explore the similarities and differences in academics’ use of new
technologies for teaching and learning within and across the three disciplines, identify 
further areas for research, enhance network opportunities, and promote cross 
institutional discussion about the use of the new technologies for teaching and 
learning. 

Potential participants were contacted via several methods. In the first instance the 
relevant subject centres of the Learning and Teaching Subject Network (LTSN) were 
contacted, and as a result specific individuals who were likely to be interested in the 
study were contacted, adverts were placed in the centres’ newsletters, and e-mail
messages were sent to their members. E-mail postings were made to relevant subject 
specific web sites (for example, the CHAIN[1] network in nursing/midwifery) and 
personal contacts were also utilised. 

Seven academics participated in each focus group, and each group encompassed 
academics from higher education colleges and from pre- and post-1992 universities. Five
participants from each group were “traditional” academics who carried out typical 
teaching, research and administrative responsibilities. The remaining two members 
had slightly different roles and responsibilities within their own institutions. In each 
group a sixth member had greater responsibility for the development and 
implementation of ICTs in teaching and learning within their own department or
school, with reduced teaching time; the seventh participant in English and 
Nursing/midwifery was an educational technologist; and the final member of the 
Law group was a librarian who also had teaching responsibilities. Participants in each 
of the three groups were involved in teaching a range of courses and programmes in 
their discipline at undergraduate and postgraduate levels. 

The focus group discussions lasted an hour and a half and were semi-structured. 
The debates were recorded and notes were made at each meeting. The tapes were
transcribed and the resulting transcripts were analysed in accordance with the
principles of the qualitative tradition (e.g. see Miles and Huberman, 1994); the analysis was
aided through the use of NUD*IST (e.g. see Boulton and Hammersley, 1996).


Results 
In this section the four main themes that emerged from each of the focus group 
discussions are explored: 


(1) How ICTs are being used in teaching and learning.
(2) The motivations of academics to adopt ICTs in teaching and learning.
(3) The difficulties they have encountered when using these technologies for their
students.
(4) The factors that may influence the further adoption of new technologies in
higher education.


The use of ICTs for teaching and learning 

In all three subject areas, the most common way participants used ICTs for teaching 
and learning was to provide students with access to a range of online resources, often 
including online discussion boards; with some participants using more advanced 
multimedia to provide web casts of lectures, simulations, and problem based learning
exercises. The majority of participants in each focus group utilised a virtual learning 
environment (VLE). In the main, these online resources were used to enhance the 
existing learning experience for students in some way as opposed to transforming the 
way the students were taught. The use of ICTs in the way described here, that is to 
enhance existing teaching practices and to maintain the current teaching paradigm, is 
typical and can be seen in a variety of different subjects, degree programmes and 
departments (e.g. Dutton et al., 2004).











From the discussions in each of the three groups there did not currently appear to be 
a great deal of difference in the way new technologies were being used for teaching and 
learning both within and across the disciplines. Yet slight variations in emphasis 
emerged from the discussions around how ICTs could be most valuable to students. In 
English, online resources, such as newspapers, journals, graphics and books, were felt 
to be particularly valuable as was the use of online communication between students 
and staff via e-mail and discussion boards. Similarly, in Law, online resources and 
discussions were thought appropriate to enhance understanding and knowledge about 
the subject; but there was an additional interest in using the technology for students to 
learn about, and develop, the skills necessary to become a Law professional. In this 
group there was a great deal of interest (and some use) of ICTs to create simulations to 
help students learn the more practical skills they would require (e.g., negotiation) in 
their professional practice; though how, and the extent to which, these were used 
varied depending on the level of the student and the stage of qualification. In 
Nursing/midwifery the use of ICTs was thought appropriate throughout the students’
programme in order to help them learn both subject knowledge and practical skills. In 
this group academics were using ICTs primarily for access to resources and 
administrative purposes; yet there was a move towards the use of ICTs for simulation 
and the introduction of more multimedia in problem based learning exercises. These 
subtle differences in emphasis are perhaps to be expected and related to the general 
differences in the disciplines: English is a “traditional” academic subject where 
students are not being prepared for a particular profession, whereas 
Nursing/midwifery students are required to learn specific, clinical skills in addition 
to academic knowledge. Law falls between these two disciplines, with the early years 
being seen as a “traditional” academic subject, moving towards a more vocational 
emphasis in the later years. 


Motivations for academics to use ICTs for teaching and learning 

In all the focus groups, the participants’ main motivation for using ICTs was to 
enhance the educational experience for their students in some way. It was clear from 
the discussions that for all subjects the decision to utilise ICTs was based on 
educational, not technological, decisions. As a member of the Nursing/midwifery group 
explained: 


I think it is important just to start with the outcomes that you want to achieve ... Then you 
work backwards to see what is the best media to achieve that; and for some it will be face to 
face in the classroom, but for some of those characteristics, or behaviours, or learning then, 
you know, a classroom isn’t appropriate. So then you find the appropriate bit of technology 
that might be able to support that. Rather than thinking, “oh, I have got a good bit of kit here, 
or I have got [name of VLE], therefore it is going to be the panacea for everything.” Well it 
isn’t, you know (Participant 2, Educational Technologist, Old University). 


A lesser theme in each of the three groups was their personal interest and enjoyment in 
using technology to benefit their students. For example, in English participants 
discussed the sense of satisfaction they obtained from the creative process of 
developing web sites and knowing the students were using and benefiting from these 
online resources. 

In all three groups, academics were using ICTs to compensate for some of the 
changes occurring in higher education. For example, in English and Law, participants 
highlighted some of the difficulties of teaching far greater numbers of students without
an associated rise in funding. Particularly in English, ICTs were considered useful to 
provide students with access to scarce resources, materials not available at their own 
institution or resources that were in very limited supply in the library. 

In Law and English the use of ICTs was thought to help assist with providing a 
social function for students who may feel lost in very large teaching groups through, 
for example, the use of online discussions. In addition to the increase in the number of 
students, there had also been changes to the characteristics of the students lecturers 
were expected to teach. This was a particularly apparent theme in the 
Nursing/midwifery group, where participants felt that ICTs could perhaps assist 
with challenges such as improving students’ study skills or developing their
background knowledge of the subject. 

Also, ICTs were thought to be useful to assist the increasing number of part-time, 
geographically dispersed learners and/or students who spent a great deal of time off 
campus. This was particularly the case for postgraduate students in each of the three 
disciplines. However, members of the Law group stressed that more flexible learning 
and/or a move towards more resource based learning were also sometimes demanded 
by campus based, supposedly full-time, students as increasing numbers of students
were working to fund their studies. 

As a participant in the Law focus group commented: 


The make up of our students ... they don’t live on campus, they travel in, live at home, they 
constantly say — at the beginning of the year you can guarantee if they have got four days in 
the university and perhaps only one hour on one day they say, “well for a start we want to 
move that to that day,” and then say, “well, why can’t we have all our lectures banded 
together in one day and do six hours?” So to be fair to the university there are also a lot of 
students who, if you like, with customer pressure coming and saying, “we actually want all 
our teaching taught in blocks because I have got a part time job ... and therefore I won’t be
here,” you know. Perhaps 20 years ago when I was at university I was in the university five 
days a week — my life was around university and very much now they are here when they 
have lessons, possibly to use the library, and then they are off campus (Participant 4, Law 
Lecturer, New University). 


While members of the Law group noted that such block teaching was not 
educationally beneficial for students, it was, in some cases, preferred by students for 
the reasons cited above and was also desired by the university to ease timetabling 
pressures and overcome the difficulties of accommodating all the students on the 
campus at one time. 

Institutional factors, such as decisions by the school or senior management to 
encourage the adoption of ICTs for teaching and learning were not a major motivating 
factor for academics who participated in this study. Indeed, as is clear from the results 
in the section under The staff level, academics tend not to be given time in their 
working day to pursue such e-learning initiatives and are unlikely to be promoted on 
the basis of good teaching. However, it was evident in each group that the universities’
decision to support such initiatives had alerted at least some academics to the 
possibility of using ICTs for teaching and learning and, in the case of the educational 
technologists, provided them with a job opportunity. 





Difficulties encountered by the academics when using ICTs for teaching and learning 
Overall, the discussions in each of the three groups were overwhelmingly positive. Yet 
it was clear the academics in each of the three groups had encountered some difficulties 
when using ICTs for teaching and learning. 

Lack of time was an issue for the majority of the participants in each group. 
Interestingly, this was not presented as a particular problem as the academics 
appeared to simply accept that a great deal of their activity in this area took place in 
their own time. The only concern that arose from this situation was that academics 
wanted to have more time to develop and improve the online resources they were 
providing for their students. 

As a lecturer in Law commented: 


I am probably like a lot of you, you know, the IT bit, the development, which is terrible, is 
almost added on to my other duties [general agreement from the group] it is not integral to my 
duties. I don’t begrudge that, but there is a limit therefore how much time I can spend doing it 
(Respondent 4, Law Lecturer, New University). 


A second factor that was raised in each of the three groups was dissatisfaction with the 
software the academics had access to. Virtual learning environments (VLEs) were 
raised as problematic by members of both the Law and English groups. In English,
academics felt VLEs were restrictive, as only registered students could use them (thus
preventing students from previous years returning to the material), and were sometimes not
straightforward to use (e.g. to upload files). In Law, participants pointed out that the
standard VLEs on offer were too corporate and not flexible enough.
As one participant in the Law group explained: 


[Name of VLE] is highly constraining and very, very generic and it has to be because they ... 
want to sell as much and as many, you know. But it comes back to the point people made 
earlier that these kind of generic solutions are imposed corporately and institutionally upon 
us. Whereas we are at the coalface, and it is our discipline, and we want to teach in ways that 
we want to teach, you know, and it is really annoying to have to teach in the way that [name 
of VLE] says that we have to teach. I think it is outrageous I really do (Respondent 2, Law 
Lecturer/Educational Technologist, Old University).


In the Nursing/midwifery focus group members felt the software had to improve a 
great deal, and similar to the other groups there was a feeling that educators should 
work more with software developers in order for programmes to be developed that 
were more suitable for their subject. 

As a member of the group commented: 


I would like people like yourselves to really think about how you can influence technologists 
to develop better technologies, you know, this is what we need for our students, build it. This 
is what, this is the kind of interaction, the kind of multimedia, we want. We do not just want 
linear discussion boards we want something more real (Participant 6, Midwifery Lecturer, 
Old University). 


A third issue that was raised by members of the English and Nursing/midwifery 
groups was problems with copyright, due to the prohibitive costs or the time it took to 
gain permission to use materials that delayed, or prevented, the development of online 
resources for their students. 

Factors influencing the adoption of ICTs for teaching and learning

In each of the three groups there was discussion regarding factors that may enhance, or 
inhibit, further adoption of ICTs across participants’ institutions. The discussion is
split into four areas: 


(1) The institutional level.
(2) The school/department level.
(3) The staff level.
(4) Other factors.


The institutional strategy. Members of the Nursing/midwifery focus group stressed the 
need for an institutional, strategic vision in order for the use of ICTs for teaching and 
learning to be adopted successfully across the institution. Without such a vision, work 
in this area would typically be dispersed in small pockets across the organisation. As a 
member of the group commented: 


You have got the people on the bottom that have often got some fantastic ideas and can see 
where the market is for it, and the use; but unless you have got that strategic vision. If, for 
example, whether the university is going to go for profit in offering online distance courses or
whether they are going to use technology to support their current learning I think that needs 
to be a very clear strategic decision made by the institution. If they don’t make that kind of 
decision then people will just go off and do their own thing but then needless to say you have 
the champions that are going off and developing things fantastically, you don’t bring the 
followers on behind and you don’t then have the infrastructure to support it (Participant 5, 
Nursing Lecturer, Old University). 


However, it was clear that the kind of institutional strategy implemented was central to 
the successful and appropriate adoption of ICTs for teaching and learning. Indeed,
there were concerns from members of the English and Law focus groups that top down
strategies could be counterproductive and have a negative influence on the standard of
teaching and learning at that institution and the likelihood of the development of 
innovative teaching using ICTs. Such a situation was most likely to occur where 
university management stipulated that academics had to use the university approved 
VLE and had to use it in a specified way. For example, in some institutions academics 
are expected to have, at the very least, a module web site that contains the lecture 
handouts and the aims and objectives for the module. Such an approach was thought to 
be very damaging to the appropriate adoption of ICTs for teaching and learning.
Members of the Law and English focus groups argued that university managers 
should not dictate to people about how they teach; as such methods and approaches 
may not be appropriate for that particular course or discipline. Further, members of all 
three groups were concerned that this kind of top down approach may reduce the 
likelihood of academics ever adopting or considering utilising ICTs in a more 
appropriate way in their future teaching. This was because such a prescriptive, top 
down strategy often meant that academics’ first contact with these kinds of new
technologies tended to be based on technological or cost saving agendas (such as the 
availability of the technology, passing the cost of printing course booklets onto 
students, or saving lecture space) not the educational potential of these new 
technologies which would interest and motivate academics. As a participant in Law, 
whose own institution had adopted a top down approach, commented: 





The university said every module must have a [name of VLE] site and the minimum you must 
have on there is an outline of the lecture materials. So obviously, I mean if you give that 
prescription, you know, you go and look at half the Law modules and they have all got a nice 
[name of VLE] site and they have got their lecture outlines on and that is it because that is, 
you know, it is just the total wrong way to go to get people to use it. It worries me that at the 
moment it has all been kind of management based, driven down, you will do this. Rather than 
us, as you said, who are teaching it, saying well, you know, we actually don’t want that, it 
doesn’t work; we want to do it this way (Respondent 4, Law Lecturer, New University). 


The school/department level. As many participants came from devolved institutions,
the more local, middle level of management was clearly thought to be important for
negotiating the institutional level policies to fit the needs of the students within
the department/school, and for providing extra resources where necessary in terms of
technical support or time for staff to develop materials. However, often this “middle
level” was not supportive of the development of ICTs for teaching and learning or did 
not take a particularly strong view. This issue was discussed at most length in 
Nursing/midwifery, and to a certain extent by participants in the Law focus group. As
a member of the Nursing/midwifery focus group commented: 


It is the middle managers that are the block; they’re the ones who decide how much time you 
can have to develop things. Our higher echelons are really keen ... but they cannot do 
everything, it has got to be down to the individual faculties or departments. But if your 
department head is saying, which is what is happening in mine, you have got to do this 
amount of teaching and that has to be face to face rather than preparing e-material then the 
e-material is not going to be as good if I bother with it. It is not going to be as good because I 
haven't got the time for it. So somehow we've got to get, got to join the top and the bottom 
(Participant 1, Nursing Lecturer, New University). 


Similarly, members of the Law group noted how important it was that the head of 
department was supportive of the use of ICTs for teaching and learning. If so, far more 
was possible. As a member of the focus group commented: 


All pioneers of IT should become heads of department as soon as possible! (Participant 1, Law 
Lecturer, Old University). 


In some cases, where school/faculty and/or department level strategies were developed 
well, the “top down” and “bottom up” strategy could come together quite fruitfully. As 
a member of the Nursing/midwifery group where this had occurred commented: 


So you begin to mesh the two things — the wider college and what it is doing with the school, 
and as the team of two got embedded we won money from central teaching development 
funds so we could gradually employ extra people ... So we ended up with a team of 4/5, and 
the flexibility then to do things with people becomes much greater because you have a much 
greater resource in terms of skills ... and you get in little bits of people to create a core 
support team who will then allow people to do whatever they feel they want to do to support 
their [teaching]... It has been terribly important to talk to other people across the college who 
are doing different things and picking up their ideas and — that has been very valuable 
(Participant 2, Educational Technologist, Old University). 


The staff level. From discussions with participants in each of the three groups it was 
clear that one factor that was inhibiting the adoption of ICTs in teaching and learning 
was the lack of value the institution placed on the development and implementation of 
e-learning initiatives. A clear example of this is the lack of time academics had 
available within their working day to develop the necessary skills and create online 
resources for their students. A related issue that was raised by members of the Law 
and English groups was that academics were more likely to be promoted on the basis 
of good research as opposed to good teaching. A further, connected, issue was the 
ability of academics to take risks. Members of the Law group raised this issue as 
younger members of staff were less prepared to take risks compared to senior staff as 
they had to consider promotion. Younger members of staff were also more likely to be 
lacking in confidence and lacked the local, cultural knowledge required to negotiate the 
rules of the institution in order to create e-learning resources. The need to be able to 
take risks was also raised by members of the Nursing/midwifery group who argued 
that the culture of the institution needed to allow academics to innovate and managers 
had to accept that such initiatives may not always succeed. As a participant 
commented: 


If you are very much bound into a bureaucratic institution and they don’t want to take risks 
... I think it needs quite a mature leadership to allow people to take risks. It is a learning 
process. The whole thing is a continuous evolution; you won’t always get it right (Participant 
5, Nursing Lecturer, Old University). 


A further factor that may inhibit adoption that was identified by members of the 
Nursing/midwifery and English focus groups was the lack of IT skills staff possessed. 
However, members of the Nursing/midwifery group felt that a change in emphasis by 
the institution from a focus on the use of ICTs as a technical “solution” towards a stress 
on the educational potential of new technologies may assist academics to overcome 
their perceived barriers about using technology for teaching. As a member of the 
Nursing/midwifery focus group commented: 


A lot of people have perceived barriers ... A lot of people in my university persist in thinking 
they have to have some specialist expertise, they have to have some IT magic or something, 
to be able to do it and this puts them off and prevents them from seeing what you are saying, 
you know, it is just a way of enhancing my teaching. That is quite a big change in attitude to 
get across and even in a new university where the priority is teaching and learning, people are 
more interested in that, but they can’t quite believe that it is not technology driven and many 
get put off by that and they think, “oh, no, I would have to go on some big special course and 
it would take ages and I haven’t got the time” (Participant 7, Midwifery Lecturer, New 
University). 


In addition to valuing teaching and learning through various strategies, such as 
promotion for good teaching and creating time for staff to engage in e-learning 
activities, a further factor that may help the development and implementation of the 
use of ICTs in teaching and learning is the employment of a departmental or school 
educational technologist. This issue was discussed in the Nursing/midwifery and 
English groups. In general, such support was considered valuable; though members of 
the English group stressed the need for the individual to have some subject expertise 
and/or an understanding of the theoretical underpinnings of learning technology as 
opposed to an individual who just had technological skills. Such an individual could 
support academics who knew very little about using ICTs for teaching and learning 
and those with far more expertise. 

Other factors involved in the adoption of ICTs. In addition to more top down, 
institutional level policies, participants in each of the three focus groups felt that they 
had a role to play in encouraging greater diffusion of the use of ICTs for teaching and 
learning; through evaluating what they were doing, teaching others to use the 
technology and demonstrating how it could be used to benefit students. However, 
members of the English and the Nursing/midwifery groups noted that innovators 
could actually be a negative influence on encouraging adoption of these new 
technologies as they may appear to be too technologically advanced. As a member of 
the Nursing/midwifery group commented: 


It is easy for people who do it to underestimate other people’s barriers. I know myself 
sometimes by saying things to people like, certain dangerous phrases like, “it is quite 
straightforward”, you know, don’t say that ... if you don’t find it quite straightforward it is so 
easy to be put off (Participant 7, Midwifery Lecturer, New University). 


A further important factor was the student experience. Members of the three focus 
groups raised issues around accessibility, both in terms of availability of computers 
and/or the level of IT skills as an issue for their students. Access to computers was 
often highlighted as a problem by members of the English focus group, but not to a 
great extent in Law or Nursing/midwifery groups. Slow download times were 
considered a problem by members of English and Nursing/midwifery groups and the 
use of passwords (such as ATHENS) at home was also considered problematic by 
members of the English and Law groups. IT skills were thought to be improving 
steadily among Law students — though members of the group felt that support should 
still be provided for the minority that required it. Basic IT and internet skills were also 
highlighted as a problem by members of the Nursing/midwifery group, particularly for 
mature students updating their skills on part-time courses. Members of the English 
focus group felt that their students needed more support to develop information 
searching skills and other research skills required to use the online resources 
effectively. A third problem that was raised by members of the English and Law 
groups was the costs that were sometimes being passed on to students when they were 
required to print their course packs as opposed to being given their own paper-based 
copy. Clearly, these and other factors that are part of the student experience need to be 
considered in order for the adoption of ICTs to be successful. 


Discussion and conclusion 
The discussion above has highlighted some of the main themes that arose from focus 
group debates with academic staff who have used ICTs for teaching their students. An 
interesting finding arising from the analysis is the high level of agreement within each 
of the three groups. Due to the very different institutional and departmental contexts 
within which the individuals work, their different roles, and the different aspects of the 
discipline they teach, it was anticipated that there would be a great deal of difference 
amongst academics — even among those who are using ICTs to teach the same 
discipline which is not obviously apparent in the analysis here. A potential reason for 
this is the method utilised; academics have few chances to discuss the use of ICTs in 
teaching and learning with others from their own discipline, and perhaps, in the spirit 
of collaboration, academics found common themes to discuss in detail and ignored the 
nuances of their experiences in this context. 

While ICTs were being used in similar ways both within and across disciplines 
there appeared to be some differences in emphasis in the way new technologies were 
being used, which may, in part, be explained by the vocational emphasis of the course. 
A clear motivating factor for academics in each of the three groups was to use ICTs to 
enhance the educational experience for their students in some way and to overcome 
some of the difficulties associated with teaching far greater numbers of students 
without an equivalent rise in funding. Academics were also using new technologies to 
accommodate the needs and demands of the student population. For example, 
academics were using ICTs to provide more flexible learning opportunities and to 
support students who, in some cases, had additional educational needs to the 
“traditional” student entering higher education. Intrinsic rewards were of greater 
importance to the participants as opposed to institutional rewards such as promotion 
or greater prestige within their institution; this finding is supported by other research 
in this area (e.g. Hannan et al., 1999; Eynon, 2005). However, the motivations which 
propel innovators to adopt new approaches are likely to be different from other 
academics and more “mainstream” staff are unlikely to adopt the use of ICTs for 
teaching and learning without such extrinsic benefits. 

Clearly, academics in each group stressed the need for a greater sensitivity to local 
contexts. Academics in each of the three groups highlighted the need for greater 
collaboration with software developers in order that future programmes and 
technologies would be developed that could accommodate the varied demands of 
educators. There was also a need for the institutional strategy to support the diverse 
ways that ICTs could be used for teaching and learning in different contexts across the 
institution. Certainly, academics felt they should have a greater role in shaping 
institutional strategies in this area; and a prescriptive “top down” strategy was thought 
to have a potentially damaging effect on the future adoption of ICTs for teaching and 
learning. From analysis of the discussions, there is some tension between the needs of 
the individual member of staff to develop, implement and use ICTs for teaching within 
their own contexts alongside the call for the institution to provide support for 
e-learning. A balance must be struck by the institution between providing support and 
allowing academics the space to innovate in their own particular contexts. 

Factors identified here that need to be included in institutional strategies have also 
been raised in other research on this topic. For example, the provision of resources, 
incentives for staff, training and financial investment are often highlighted in the 
literature (e.g. Taylor, 1998; Ryan et al., 2000). Further factors identified here include: 
the need to value research that explores the use of ICTs in higher education in order to 
develop an evidence based culture; to make decisions to support ICTs in teaching and 
learning that are based on educational philosophies as opposed to cost saving or 
technological deterministic agendas to enhance higher education and to convince staff 
of the value of new technologies in teaching; and to consider the student experience. 
Students may require improved access and technical support in order to effectively use 
ICTs as part of their university education (Tweddle et al., 1998; Ryan et al., 2000); yet 
there are other factors that need to be considered. Individuals will only use new 
technologies when they see them as valuable, fulfilling a useful function or purpose 
(Morrison and Svennevig, 2001). 

Indeed, while academics do need the resources and infrastructure to support their 
initiatives, the innovative process should be encouraged in as flexible a way as is 
possible; there needs to be a greater sensitivity to the needs of the individual 
department and discipline. Academics are best placed to determine where ICTs should 
be used (if at all) in teaching their students. Indeed, academics do use other 
technologies where they perceive them to be appropriate, though they remain 
pressured; thus, it is unwise to conclude that non-use is simply down to practical 
issues, such as a lack of time or institutional rewards, as there may be other good 
reasons (Crook, 2002), such as a lack of student demand and inappropriateness of 
subject matter. What is apparent is how important local context is in the use (or non 
use) of ICTs for teaching and learning and that academics need to be part of the process 
when developing future policy and technological developments in e-learning. Clearly, 
further research is required to explore the issues raised in this exploratory project in 
more detail. 


Note 
1. CHAIN is an online network for people working in health and social care, based around 
specific areas of interest, and gives people a simple and informal way of contacting each 
other to exchange ideas and share knowledge. See www-nhsu.nhs.uk/webportal/chain/ 


Abstract 


Purpose - The paper focuses on e-learning from an information literacy perspective and promotes 
the view that information literacy education needs to play a central role within any e-learning 
initiative. The main aim of this paper is therefore to present the claim that e-learning must be 
supported by an information literacy framework to enable an effective interaction between learners, 
information literacy educators and complex information environments. 
Design/methodology/approach - Literature-based analysis of the main issues covered. These 
include: the challenges generated by the proliferation of digital information and the consequent need 
for information literacy education to counteract the phenomenon of information overload; the 
comparison of the information literacy approach promoted by Australia and the USA with the 
ICT-skills approach adopted by the UK. 

Findings - Examples of information literacy frameworks promoted by the Association of College 
and Research Libraries and the Australian and New Zealand Institute are used to illustrate the strong 
association between the “learning-how-to-learn” model, lifelong learning and the global knowledge 
economy. The UK perspective on e-learning reveals a similar lifelong-learning agenda, although in this 
case ICT skills, not information literacy, are identified as a priority, even though the effectiveness of 
lifelong-learning competences depends on the learner’s ability to interact with constantly changing 
information and knowledge structures. 

Originality/value - The paper promotes the view that a fully-fledged information literacy 
education, based on nationally recognised standards, must underpin any pedagogical initiative 
especially in the area of e-learning which requires the learners’ active engagement with a wide range of 
information sources and formats. The paper is therefore relevant to those professionals involved in the 
development of policy and provision at higher education level. 


Keywords Information, Literacy, Self managed learning, Lifelong learning, Pattern recognition, 
Facilitation 
Paper type Viewpoint 


Introduction 
Perhaps not all academics in the UK would identify with the e-learning developments 
introduced by Twente University in The Netherlands. This institution is the first 
wireless campus in Europe to test the pedagogical and logistical boundaries of higher 
education by replacing class-based learning with a wireless network where students 
and staff “can stroll from classroom to café, [halls of] residence, and even the 
surrounding woods and parkland with their laptop and mobile phones logged on” 
(Leon, 2004, p. 8). Not surprisingly, this innovation has led to a shift “away from the 
physical campus and a move towards network access as the heart of participation” 
(Leon, 2004, p. 9). Twente University fosters an e-learning environment where 
knowledge acquisition, reflected by the “prescribed reading” approach, has been 
replaced with knowledge construction shown by the learner’s application of critical 
thinking, problem solving, and by the ability to search a range of sources in a variety of 
media. As a result, the large selection of resources at the learner’s disposal offers 
opportunities to enhance the learning experience through the “excitement of choice”. 

This paper aims to illustrate that the competences associated with e-learning come 
under the umbrella of information literacy. Therefore, the view presented here rests on 
the premise that e-learning needs to be underpinned by information literacy skills to 
foster independent learning, predispose the students towards a lifelong-learning 
attitude, and equip them with the ability to make informed decisions to deal effectively 
with information overload. Without information literacy skills, the rapid proliferation 
of sources becomes a burden, and a hindrance which can lead to the alienation of the 
learners. This problem was first identified by Oberman who warns that, “Students 
unable to cope with the overwhelming number of choices [...] will be further 
disenfranchised from the information infrastructure.” (Oberman, 1991, p. 200) 

The paper first outlines the information literacy framework through literature from 
some of the major information literacy promoters, such as ACRL[1] and ANZIIL[2]. 
These perspectives will be used to support the claim that information literacy is the 
foundation of lifelong learning, and therefore it is an essential component of any 
e-learning strategy. Implications of implementing an information literacy policy will be 
explored from the point of view of the information literacy tutor or educator working 
within the Higher Education environment. This term describes either an academic 
librarian responsible for user education activities, or a faculty member of staff 
delivering information literacy programmes at whatever level. Finally, the UK 
perspective on e-learning will be presented through a brief evaluation of the 
Department for Education and Skills (DfES) consultation paper, Towards a unified 
e-learning strategy, published in 2003. Examination of this paper reveals that the move 
by the British Government towards this innovative educational environment aims to 
foster the type of lifelong learning skills associated with the information literacy 
approach. However, a comparison between the UK e-learning model and the 
information literacy education initiatives in other English-speaking countries 
illustrates a need for the UK to implement a more cohesive information literacy policy. 


Information literacy: foundation of e-learning 

Although a range of interpretations of the term information literacy has been 
developed by educational institutions and professional organisations alike, most of 
these are likely to have derived from the definition originally produced by the 
American Library Association (ALA) in 1989: 


To be information literate, a person must be able to recognize when information is needed and 
have the ability to locate, evaluate, and use effectively the needed information. Producing 
such a citizenry will require that schools and colleges appreciate and integrate the concept of 
information literacy into their learning programs and that they play a leadership role in 
equipping individuals and institutions to take advantage of the opportunities inherent within 
the information society. Ultimately, information literate people are those who have learned 
how to learn. They know how to learn because they know how knowledge is organized, how 
to find information, and how to use information in such a way that others can learn from them. 
They are people prepared for lifelong learning, because they can always find the information 
needed for any task or decision at hand (American Library Association, 1989, p. 1). 


The association between the learning-how-to-learn perspective and the 
lifelong-learning initiative is strongly promoted by the information literacy 
frameworks produced by both ACRL and ANZIIL. The latter offers a diagrammatic 
representation of this relationship, where information literacy is placed as a 
sub-category of independent learning which, in turn, is a sub-set of lifelong learning 
(see Figure 1). 

This model makes information literacy education “...consonant with reform 
agendas in government, in communications technology and in education ... [and]... 
with employers’ demands for an adaptable and responsive workforce ...” (Bundi, 
2001, p. 7). In other words, it is seen as a response to the challenges posed by lifelong 
learning. The link between information literacy and lifelong learning is also perceived 
as an economic enabler and therefore highly valued. For example, the OECD’s (1996) 
report on The Knowledge-Based Economy stresses the importance of the ability to learn 
in order to fulfil increasing demands for highly skilled workers: 


The knowledge-based economy is characterised by the need for continuous learning of both 
codified information and the competencies to use this information. As access to information 
becomes easier and less expensive, the skills and competencies relating to the selection and 
efficient use of information become more crucial. Capabilities for selecting relevant information, 
recognising patterns in information, interpreting and decoding information as well as learning 
new and forgetting old skills are in increasing demand (O'Sullivan, 2002, p. 8). 


At a social level Bruce associates people’s empowerment with the ability to operate 
within complex information environments: 


People to function effectively in their personal and professional lives need to understand and 
interact with an ever changing information environment. Recognition of this need, often 
described in terms of empowerment of the individual, contributed further to the emergence of 
the concept of Information Literacy (Bruce, 1997a, p. 3). 


[Figure 1: diagram placing information literacy within independent learning, which in turn sits within lifelong learning] 
Source: Bundi (2004, p. 5) 


Information literacy: a new learning culture 

The transformation of the learning and teaching environments generated by e-learning 
strategies requires substantial changes in the methods of provision. Bruce sees 
information literacy as the main vehicle driving this educational change and presents 
the view, initially put forward by Breivik (1998), that “the traditional methods of 
lectures, textbooks and even artificially constrained multimedia resources” (Bruce, 
2002, p. 5) must be replaced by conditions that foster active learning and the use of real 
world information resources: 


Education needs a new model of learning [...] that is based on the information resources of 
the real world and learning that is active and integrated, not passive and fragmented [...]. 
Textbooks, workbooks and lectures must yield to a learning process based on information 
resources available for learning and problem solving throughout people’s lifetimes (Bruce, 
2002). 


Bundi fully supports this view: 


The creation of a learning culture which produces graduates with a capacity and desire for 
lifelong learning in a rapidly changing, complex, and information abundant environment, 
requires a major shift in the educational paradigm (Bundi, 2001, p. 5). 


The challenges generated by the digital environment have led to a rapid increase in the 
amount of information available. This has made users more “vulnerable to 
misinformation” (Lichtenstein, 2000, p. 23) and information literacy education 
provides the high-order literacy required to deal with this problem. Unlike the general 
definition of literacy, which involves basic competences in reading and writing, 
high-order literacy according to Lichtenstein “[...] goes much further and entails the 
ability to make inferences from material, formulate questions and develop ideas” 
(Lichtenstein, 2000). He quotes the ACRL’s (2000) Information Literacy Standards to 
justify the implementation of information literacy education: 


Because of the escalating complexity of this [information] environment, individuals are faced 
with diverse, abundant information choices in their academic studies, in the workplace and in 
their personal lives. Information is available through libraries, community resources, special 
interest organisations, media and the Internet and increasingly, information comes to 
individuals in unfiltered formats, raising questions about its authenticity, validity, and 
reliability. In addition, information is available through multiple media, including graphical, 
aural, and textual, and these pose new challenges for individuals in evaluating and 
understanding it (Lichtenstein, 2000, p. 25). 


This generates a number of challenges for the information literacy educator. For 
example, librarians have seen their role change substantially and Lichtenstein 
attributes this to the fact that they no longer control access to information, given that 
print has lost its monopoly as a format for information storage. Such a view is 
confirmed by Buchanan et al. (2002) who describe the impact of the digital environment 
on the process of learning and the consequent effect on information literacy provision: 


Increased access to technology has altered the way that students study, while the variety of 
electronic information resources has widened the potential resource base for all students. 
These developments have reduced face-to-face teaching in the library and the need to visit the 
library building for help. It has also meant that librarians need to alter the way they plan and 
deliver information literacy instruction (Buchanan et al., 2002, p. 146). 





Although the literature reflects on the practices of librarians, the changes in the 
students’ study behaviour have an impact on IL educators in general as they are 
encouraged to assume the role of “professional information navigators ... who can 
train people to be information literate and operate effectively in our contemporary, 
complex and largely unregulated information environment” (Lichtenstein, 2000, p. 24). 

Moreover, Ford (1995) argues that in the information age effective information use is 
characterised by “... people who can tease knowledge and understanding out of large 
information flows. They will be pattern finders, applying new intellectual skills and 
working with more powerful information tool.” (Ford, 1995, p. 99) Ford’s definition of 
users reflects the need to implement the learning-how-to-learn approach advocated by 
information literacy and this, in turn, requires the information literacy educator to 
become a facilitator of learning. Such a shift is described by the metaphor “from the 
sage on the stage to the guide on the side” (Doherty et al., 1999, p. 9) where the tutor 
encourages the exploration of the discipline through problem-solving activities that 
promote students’ interaction with subject specific information sources. 


e-learning: a campus-wide endeavour 

Martin comments on the development of an e-literacy policy and argues that as 
traditional face-to-face delivery is replaced by the virtual learning environment 
generated by the e-learning approach, students’ independent learning and information 
literacy skills become essential: 


Students will have to be taught how to manage their own learning processes to an 
unprecedented degree. They will have to learn to swim in a sea of information to use the rich 
resources of a supportive learning environment, to self-pace and self-structure their 
programmes of learning (Martin, 2003, p. 9). 


Buchanan et al. (2002) go further and advocate the need for close collaboration between 
library and faculty staff in order to ensure the integration of information literacy 
education within this virtual environment. They define the virtual university as “[...] 
an institution, or a set of institutions, engaged in a delivery of degree granting 
programs in higher education, using technology and methodology outside a traditional 
classroom” (Buchanan et al., 2002, p. 146). 

To ensure full integration of information literacy into the e-learning strategy 
Lichtenstein (2000) also argues in favour of collaboration between library instructors 
and faculty staff so that “students will see greater purpose in [information literacy] if 
the skills they are being taught are co-mingled with assignments from their regular 
college courses” (Lichtenstein, 2000, p. 30). 


The UK e-learning strategy 

In the consultation paper produced by the DfES e-learning is described as learning 
supported by interaction with information and communication technologies (ICTs). 
The DfES claims that this new learning environment promotes the following changes 
in the educational scenario: 


e-learning exploits interactive technologies and communication systems to improve the 
learning experience. It has the potential to transform the way we teach and learn across the 
board (DfES, 2003, p. 9). 


This e-learning vision for a future education system consists of specific targets that are 
fully explored in the DfES paper. Here we will concentrate on two of these targets as 
they best illustrate the link between e-learning and features associated with 
information literacy, namely the lifelong-learning initiative and the development of an 
education system that addresses the requirements of a global knowledge economy. The 
DfES argues that e-learning offers the opportunity for: 


* the empowerment of learners through more active learning and ultimately the 
creation of “a professional workforce and fulfilled citizens” (DfES, 2003, p. 16) 
through the mastery of self-directed lifelong learning practices; and 


* the development of innovative provision geared to address the needs of a global 
knowledge society and the offering of a more flexible education system that 
responds to the needs of learners irrespective of their location. 


The literature illustrates that both of these goals must be underpinned by information 
literacy education. For example, the contribution of information literacy to the global 
knowledge society is made explicit by the OECD statement discussed earlier, while the 
issue of empowerment through learning was identified by Bruce as a driver of 
information literacy education. Learners’ empowerment has also emerged as a major 
theme in the study of the impact of an information literacy module on undergraduate 
social sciences students at London Metropolitan University (Andretta, 2005). Although 
the DfES paper identifies information literacy as a necessary e-oriented skill, it is the 
area of ICT that is specifically singled out as “a priority for the new skills strategy” 
(DfES, 2003, p. 37). This emphasis on ICT, at the expense of information literacy 
competences, has been criticised by a number of authors (The Big Blue, 2001; Town, 
2003) who claim that the UK is falling behind when compared with the national 
information literacy initiatives promoted by other English-speaking countries such as 
the USA and Australia. 

In these countries the challenges posed by the information society and the global 
knowledge economy are met by fully-fledged information literacy strategies whose 
implementation is based on the close collaboration between information professional 
organisations and HE accrediting bodies. In 2000, for example, the ACRL’s Information 
Literacy Competency Standards for Higher Education were fully endorsed by the 
American Association for Higher Education, thus ensuring a comprehensive and 
cohesive information literacy strategy in this sector. Similarly, in Australia the report 
on Developing Lifelong Learners Through Undergraduate Education, produced by 
Candy et al. (1994), outlines a profile of graduates equipped with information literacy 
skills required to perform competently within their professional capacities and as 
members of the community. In other words, graduates must be capable of employing 
“skills and strategies to locate, access, retrieve, evaluate, manage and make use of 
information in a variety of fields, rather than with a finite body of knowledge that will 
soon be outdated and irrelevant” (CAUL, 2001, p. 14). 

The DfES claims that the e-learning environment promotes significant changes in 
the educational scenario: “e-learning exploits interactive technologies and 
communication systems to improve the learning experience. It has the potential to 
transform the way we teach and learn across the board.” (DfES, 2003, p. 9). Such a 
transformation, however, is not seen as the replacement of face-to-face delivery and the 
role of the tutor with an ICT-based learning environment. On the contrary, what is
envisaged is a complementary role played by e-learning to enhance learning and
teaching activities that take place in the classroom. This perspective deviates from the 
information literacy approach presented earlier where substantial changes are 
advocated both in terms of a new pedagogic paradigm and in the role of the tutor as 
facilitator of active learning to ensure the full implementation of the lifelong-learning 
agenda. 

The e-learning strategy proposed by the DfES, therefore, can only transform the 
process of learning (and teaching) if an information literacy policy is implemented. 
Such a policy, according to Booth and Fabian (2002), should be fully reflected in the 
national learning agenda and should be promoted by information professional bodies 
as well as by accrediting organisations. At an institutional level, information literacy 
education should also be presented as a campus-wide educational goal which is 
characterised by student-centred learning, supported by tutors as facilitators of the 
learning process (Doherty et al, 1999), and underpinned by greater collaboration 
between library and faculty staff (CHE, 1995; Rader, 1995; Snavely, 2001). 

Twente University’s wireless network is the ultimate example of what e-learning 
can potentially offer: “Gone are the banks of computers that blight the vista and cut off 
social contact to reveal open spaces constantly reconfigured to match the occasion.” 
(Leon, 2004, p. 8). Such a level of flexible provision promotes a model of e-learning 
which is revolutionary not only in its innovative use of ICT, one of the DfES priority 
skills, but also in its exploitation of social space to foster personal development 
towards the ultimate goals of becoming effective independent learners and information 
consumers: 


Technology is also shifting students’ learning experience from simple knowledge acquisition 
to knowledge construction through working together, problem solving and getting solutions 
they did not know in advance (Leon, 2004, p. 9). 


In this case, the knowledge construction model promotes the view that new knowledge 
is constructed through the learner’s individual engagement with the information. King 
(1993) argues that the process of interaction between what the learner is exploring and 
what he or she already knows generates new meanings. In addition, she claims that the 
constructivist model also enhances retention and enables the learner to develop 
complex cognitive structures where new information is fully integrated with existing 
knowledge. One of the most effective ways of constructing knowledge is to engage the 
learner in problem-based and resource-based activities (Bruce, 1997b). The certificate 
level information literacy module at London Metropolitan University, for example, sets 
a number of information-seeking tasks that students need to complete. During this 
process students have to engage with interactive tutorials that help them manipulate a 
variety of information systems, and to exert critical and evaluative skills in order to use 
or filter out the information according to the set task. As students’ information literacy 
skills are normally low on entry (Andretta and Cutting, 2003), the tutorials are 
structured on a step-by-step basis so that students can integrate the new knowledge at 
a pace which suits their learning needs. This practice has produced results that confirm 
King’s claims of increased retention of what the students have learned. Evidence shows 
that they not only become competent information consumers, but also apply their 
newly developed information literacy skills to different problem-solving conditions 
that go beyond the boundaries of this module (Andretta, 2005). Without the
knowledge-construction approach, IL educators will continue to struggle to break free
from the class-based, face-to-face model of delivery defined entirely by fixed 
class-timetabling schedules, the lecture/seminar or lecture/lab paradigms, as well as 
compulsory attendance, and where the principle of passive knowledge acquisition, 
rather than active knowledge construction, is the dominant goal. 


Conclusion 

The overall theme that has emerged from this paper is that e-learning can be 
innovative and rewarding as long as the learners are equipped with the necessary 
independent learning skills needed to take advantage of it. Information literacy 
education is well positioned to develop these skills thanks to its learning-how-to-learn 
framework which is fully articulated in the information literacy standards devised by 
both ACRL and ANZIIL. There is little recognition of the learning-how-to-learn or the
knowledge construction approaches in the DfES’ e-learning strategy, and these 
omissions should be addressed through the development of an information literacy 
policy that is embedded in the UK national learning agenda. 


Notes 
1. Association of College and Research Libraries (ACRL), a division of the American Library
Association (ALA). The ACRL Information Literacy Standards are available from the
following address: www.ala.org/ala/acrl/acrlstandards/informationliteracycompetency.htm
2. Australian and New Zealand Institute for Information Literacy (ANZIIL). The ANZIIL
Information Literacy Framework is available from the following address:
www.caul.edu.au/info-literacy/InfoLiteracyFramework.pdf
Guest editorial


The work of the Bibliometrics Research Group (City University) and 
associates 

This special issue of Aslib Proceedings: New Information Perspectives brings together
six papers on bibliometrics, or the quantitative study of publications, that have been 
written by people who have been working with the Bibliometrics Research Group 
(BRG) at City University during the years 2001-2004. The BRG was set up by the 
Wellcome Trust in order to allow the research outputs database (ROD), which had been 
developed within the Policy Unit there (formerly the Unit for Policy Research in Science
and Medicine (PRISM)), to operate more freely and attract new clients to bibliometric 
methods. During the three years of the contract, the BRG not only maintained the ROD 
and provided consultancy services based on it to the trust and its UK members, but 
also developed a separate bibliometrics consultancy for foreign clients and carried out 
innovative research. 

The BRG also welcomed a number of guests who spent time working with the 
group on their own projects. Some used the ROD, which carried the financial 
acknowledgements to the half million or so UK papers whose details had been 
downloaded from the Science Citation Index (SCI) and the Social Sciences Citation Index
(SSCI). Others used the bibliometric techniques and macros that had been developed by
the group to apply to the SCI records. These included new methods of classifying the 
papers — by subject, by potential citation impact, and by research level. Taken 
together, these three systems allow the huge number of scientific papers — now well 
over 800,000 are recorded in the SCI each year — to be put into categories that are 
defensible and transparent. The focus has been primarily on biomedical research and 
clinical medicine, although some work has been done in other areas. The new methods 
allow subject-based analysis to take place in ways that have not previously been 
possible and have led to many reports for clients and publications in journals. 

Bibliometrics is now very much an international activity and the biennial 
conferences of the International Society for Scientometrics and Informetrics (ISSI)
bring together researchers from over 40 countries. It is not surprising, therefore, that 
the associates of the BRG included people from three other countries: Brazil, Colombia 
and India, and the results of their work with the group are presented in this special 
issue. 

The group’s first visitor, and the one who stayed the longest, was Adriana 
Roa-Celis, now Adriana Atkinson, a young Colombian who was working on her PhD 
dissertation for submission to the State University of Campinas in Brazil under the 
supervision of Professor Léa Velho. This is one of the top three universities in that 
country. It is a superb monument to the vision and organisational skills of Professor 
Zeferino Vaz who created and preserved it on an autonomous basis back in 1962 when 
the military junta was making independent academic activity very difficult in Brazil. 
Dr Atkinson has been comparing the immunology research outputs from Colombia and 
Brazil; despite Colombia’s much smaller size, its research is clearly of at least as high a 
quality as that of Brazil and is more international. 
Dr Jacqueline Leta, a graduate of the Federal University of Rio de Janeiro (UFRJ),
although still young, has an established reputation as one of the leading
bibliometricians of Brazil and we worked together both in São Paulo and in London 
in 2002 when she paid the Group a visit. Dr Leta, who previously worked at the 
University of São Paulo, has now had her expertise officially recognised by her 
appointment to a faculty position back at UFRJ. During the three weeks that she was 
with the group, we used her data and our techniques of subject definition to look at the 
roles of Brazilian men and women in three fields — astronomy, immunology and 
oceanography. Her paper in this issue continues this work and examines the formal 
research qualifications and the publishing habits of these researchers. 

During 2001, I also had the pleasure of making my second visit to India, and went 
back to Delhi for a conference at the National Institute for Science, Technology and 
Development Studies (NISTADS). Bibliometrics is very actively pursued in India, and 
most ISSI conferences have a large number of Indian papers and posters. Quite a lot of 
these concern Indian research outputs and whether India is keeping up with its even 
larger and more populous northern neighbour, China — mostly it is not. Dr Aparna 
Basu, whom I had met at NISTADS the previous year, trained as a plasma physicist 
and was keen to use her specialist knowledge to define a filter for astronomy and 
astrophysics using similar methods to those that we had been using in biomedicine. 
This filter would be based on both specialist journals and the title words of the papers; 
the latter are important as they bring in about one third more papers including the 
influential ones in Nature and Science. She presented our preliminary findings at the 
ISSI conference in Sydney, Australia, later that year, and we subsequently worked
together by e-mail as she moved down to Bangalore and then back again to her home in 
Delhi to improve our coverage of astronomy to include ten years of data. 

The other three papers in this special issue draw directly upon the ROD 
acknowledgements data to identify and then analyse particular groups of papers. Dr 
Dwijen Rangnekar, who was previously at the School of Public Policy at University 
College, London, was interested to explore the role of “patient groups”, or 
disease-specific collecting charities, in the support of research on their diseases. He 
worked closely with two of them, the Multiple Sclerosis Society and the Parkinson’s 
Disease Society, both of whom were ROD members, to identify the papers that 
acknowledged their financial support and the grants that had given rise to them. His 
paper in this issue covers just the first of these studies and shows that a dedicated 
charity can make a big difference to the research activity in a closely defined area. It 
can also ensure that the needs of the present generation of patients are not neglected in 
the search for a cure in the long term. We have, however, found that most such 
disease-specific charities do support quite a lot of basic research as well: typically one 
third of their research portfolio does not pass through the filter used to define research 
relevant to the disease. 

Kartik Kumaramangalam is a PhD student at the London School of Economics and 
Political Science who is examining the British biotech industry and the factors that 
lead firms to success or failure. There have been some notable examples of both among 
the more than 100 firms whose support for research, either intramural or extramural,
has been recorded in the ROD, which spans the 14 years 1988-2001. Could the quality of 
this research, and specifically, the amount of collaboration with academia, be a useful 
indicator of the firms’ prospects? He has looked at the almost 3,000 such papers in the
ROD and tried to tease out the factors that lead them to be published in high impact
journals. In some later work he has looked at the extent to which big pharmaceutical 
companies acknowledge the utility of such work through their citation of it; this may 
be a rather better indicator of whether the firms will form the alliances that are 
necessary for their success. 

Finally, this issue turns its attention to the National Health Service (NHS), the 
biggest single provider of healthcare in the world, and one of the biggest supporters of 
research either directly or indirectly through the provision of clinical facilities for use 
by other sponsors. But does this research get published in the journals that practising 
clinicians actually read so that it can influence them in their daily work? Teresa Jones, 
Dr Stephen Hanney and Professor Martin Buxton of the Health Economics Research 
Group (HERG) at Brunel University in Uxbridge have been surveying the reading 
habits of doctors in three specialties — psychiatry, surgery and paediatrics. They have 
found that they do not correlate very well with the journals in which NHS research is 
published. Clearly there are many steps between the publication of research in a 
learned journal and its being put to practical use, but it does seem that the use of 
citation impact factors as a means to evaluate biomedical research is only giving a 
small part of the total picture. Other indicators will need developing and we have been 
engaged with some of them during recent years. They include the references that form 
the evidence base of clinical guidelines, those in textbooks, and even in newspapers 
that are an increasingly important means by which biomedical research is brought to 
the attention of both policy-makers and the general public. 

All articles were refereed by three people, drawn from 12 countries. Articles were
refereed during the period October-November 2004.


Grant Lewison 
Department of Information Science, City University, London, UK 
(g.lewison@soi.city.ac.uk) 
Abstract

Purpose — To provide an empirical contribution to analyse the dynamics of research groups in 
knowledge production in an interdisciplinary research field in two scientifically peripheral countries 
(Colombia and Brazil).
Design/methodology/approach — This dynamic is analysed in the interdisciplinary area of
immunology through a comparative study of Brazilian and Colombian research groups. The practices 
of publication, collaborative links and patterns of acknowledgements provided the framework for this 
study. Quantitative and qualitative tools were used; in particular a bibliometric study was 
complemented with information derived from semi-structured interviews with members of the 
research communities selected. 


Findings — The bibliometric study allowed the construction of some indicators: channels of 
publication, impact of the research outputs, citations and patterns of collaboration. Also, a database 
with acknowledgements was created to identify the different actors who take part in the process of 
knowledge production. These indicators, interpreted in the light of qualitative analysis, throw 
considerable light on how the different groups work on the cognitive and social aspects of knowledge 
production. 

Research limitations/implications — This study is limited to 31 leading research groups from 
Colombia and Brazil. 

Originality/value — This paper starts to redress the situation of a lack of empirical studies in 
developing countries in the use of acknowledgements as a tool to examine formal and informal 
scientific collaboration and as indicator of accountability to funding bodies. This work provides an 
empirical contribution to policy-makers and scientific communities in the task of understanding the 
dynamics of knowledge production in an interdisciplinary area combining different approaches. 


Keywords Sciences, Colombia, Brazil, Research work, Group dynamics, Information research 
Paper type Research paper
Introduction

Science is by definition a collaborative and competitive endeavour. Collaboration may 
be formal or informal, within the same research organisation, between different 
institutions in the same country or involve individuals and groups working in different 
countries. The motivations for research collaboration have been extensively discussed 
in the literature (Katz, 1994). These factors, which can be grouped as economic, cognitive
and social, vary in relative importance when explaining either field-to-field or
country-to-country differences in the rates of collaboration (Luukkonen et al., 1992).
Collaborative work is expressed in formal and informal channels and disseminated 
through different networks (Crane, 1972). Counts of publications have been widely 
adopted as measures of scientific productivity (Price, 1963) and the analysis of 
co-authorship has been used as an indicator to track patterns of collaborative 
behaviour and to find tendencies by discipline (Beaver and Rosen, 1978, 1979). The 
advantages and limitations of quantitative studies of co-authorship have been studied
extensively (see Katz and Martin, 1997; Katz, 1994). Despite the limitations of this 
approach (Katz and Martin, 1997), co-authorship analysis remains a valid, if partial, 
indicator of research collaboration (Georghiou, 1998). 

One limitation of co-authorship as an indicator of collaboration is that it fails to 
capture important contributions of researchers who do not qualify as co-authors but 
without whom the research work might not have been done. This is why a group of 
studies tend to combine different approaches to examine collaborative practices in 
knowledge production (Melin and Persson, 1996; Melin, 2000; Laudel, 2002). In this vein
the “acknowledgments” made in published articles have been said to reveal important 
features of the scientific practice (Cronin, 1991; 1995; Cronin et al, 1992, 1993), as well 
as providing a source of relevant information about the interactions taking place 
within research communities (McCain, 1991). At the same time, acknowledgments are 
also an indicator of indebtedness to funding bodies, thereby disclosing important links 
between researchers and public and private organisations (Lewison et al, 1995;
Lewison and Dawson, 1998). Notwithstanding the contribution that this information 
can bring to the analysis of the dynamics of research communities in a particular 
discipline, so far only Velho (1985) has studied this practice in peripheral countries. 

This paper aims to contribute empirically to the understanding of the dynamics of 
research communities in knowledge production by analysing co-authorship and 
acknowledgments in the publications of research groups in an interdisciplinary field 
(immunology) in two peripheral countries. 

Immunology was selected for two reasons. First, studies on scientific production in 
Latin American countries have found the medical and biomedical fields to be strong. 
Not only have such countries built research capabilities in the health sciences 
internally, but they have also participated actively in international research networks, 
thus enhancing the visibility of Latin American science. Disciplines that stand out are 
public health, pharmacology, general medicine, medical techniques and immunology 
(Krauskopf et al, 1995; Braun et al, 1994, 1995). Moreover, studies based on databases 
that cover more regional journals have found greater diversification in health research 
with a focus on local pathologies, e.g. tropical medicine, immunology and clinical 
medicine (Meyer et al, 1995). 

Second, immunology is not only important for peripheral countries, but it also 
presents important features and is heading in the direction of becoming “big science”.
By the latter is meant a scientific activity carried out by multidisciplinary groups from
basic and applied sciences, with close interaction between researchers from academia 
and industry, both nationally and internationally, and involving large financial
resources. All such dynamics, however, take place within national boundaries, and are 
affected by health-related public policies. 

The two countries (Brazil and Colombia) were selected with the aim of making a
comparative study in which possible similarities or differences could be presented.
The choice is also supported by the cited studies that made important contributions in 
this area of health. Furthermore, the largest percentages of co-authorship activities are 
found in clinical medicine and biology (Narvaez-Berthelemot et al, 1992). Another 
factor in the selection of these two countries was that they had policies for the 
consolidation of post-graduate programmes in the region, particularly in Brazil 
(Narvaez-Berthelemot et al, 1999). They also had strategies for internationalisation and 
co-operation through expatriate researchers and their networks, such as Red Caldas 
(networks of Colombians abroad; Murcia and Parrado, 1998). 

The research group was considered as the unit of analysis, following a long 
tradition in the sociology of science that recognises that not only cognitive aspects, but 
also social interactions, play an important role in science (Kuhn, 1971; Crane, 1972). There
are organisational practices, manual skills, social and material knowledge, which are 
acquired through scientific practice within the scientific tradition, and depend also on 
the cultural traditions within its institutions, country and time (Pestre, 1996). Research 
groups have their leaders who are recognised as such by the national and international 
community. It is through the leaders that we identified the groups, as reported in the 
following section. 


Methods 
Databases and web sites from national research councils and other funding institutions 
were used to identify leading research groups working in the area of immunology. 
Investigation of databases from CNPq[1] and FAPESP[2] in Brazil and
COLCIENCIAS[3] in Colombia was carried out and revealed only 31 research groups
(21 Brazilian and ten Colombian). The criteria used for this selection were quantitative
(the number of articles published in the area and indexed in the Science Citation Index
(SCI), CD-ROM version) and qualitative (groups recognised in the top categories by the
national institutions). This listing of groups was confirmed by a process of peer review
to ensure the relevance of the groups as subjects of the present study. 

Once the groups were confirmed, their scientific research output for the last decade 
(1990-1999) was downloaded from the SCI. To extract research outputs two filters were 
used: 


(1) A filter with key words in the field of immunology calibrated by an expert in the 
discipline. 


(2) A filter containing the names of the group leaders and their TAE 


The latter was developed in order to recover interdisciplinary interactions and
production by these communities. The study database comprises a total of 844 records
from Brazilian (77 per cent) and Colombian (23 per cent) communities under study
(Roa-Celis, 2002). Papers extracted contained at least one of the leading researchers as
author and were restricted to articles, notes, reviews and letters. Their citation scores
were determined for their year of publication and four subsequent years from the SCI. 
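
As an illustration only, the five-year citation window described above (the year of publication plus the four subsequent years) can be sketched in Python. The function name and the sample citing years are hypothetical and are not taken from the study, which drew its citation scores from the SCI.

```python
from typing import Iterable


def citations_c0_4(pub_year: int, citing_years: Iterable[int]) -> int:
    """Count citations received in the publication year and the four following years."""
    last_year = pub_year + 4
    return sum(1 for year in citing_years if pub_year <= year <= last_year)


# Hypothetical paper published in 1994; the years are those of the citing papers.
sample_citing_years = [1994, 1995, 1995, 1997, 1998, 1999, 2001]
print(citations_c0_4(1994, sample_citing_years))  # 5 (1999 and 2001 fall outside the window)
```

A fixed window of this kind allows papers published in different years to be compared on the same footing, since each has had the same length of time in which to be cited.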

To establish the volume and impact of publication in multiple disciplines, each 
paper was classified in three ways. The first corresponded to a major field (biology, 
biomedical research, clinical medicine, chemistry, physics and others). Secondly, the 
papers were categorised by research level (RL): 


* basic research (RL4);
* strategic research (RL3);
* applied research (RL2); and
* applied development (RL1).


Table I shows the journals most commonly used by the communities under analysis. 
These categories (major field and research level) were given for each paper according 
to the journal in which it was published and are based on the journal classifications 
created by CHI Research Inc. in the US (Narin et al, 1976). 

Table II shows the third classification of papers: the potential impact category
(PIC), which is based on the ratio of the number of citations received in the year
of publication and four subsequent years to the number of papers in a given journal.
These values come from the Institute for Scientific Information (ISI) mean expected
citation rates file, with citation data for publications in 1994.
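
A minimal sketch of how a journal might be assigned to a potential impact category is given below, using the ranges listed in Table II (below 6, from 6 to 11, from 11 to 20, above 20). The function names and sample figures are hypothetical. The source does not say which category a value of exactly 11 or exactly 20 falls into, so the handling of those boundaries here is an assumption.

```python
def journal_mean_citations(total_citations: int, n_papers: int) -> float:
    """Mean five-year citations per paper for a journal (citations divided by papers)."""
    if n_papers <= 0:
        raise ValueError("a journal must have at least one paper")
    return total_citations / n_papers


def potential_impact_category(mean_citations: float) -> int:
    """Map a journal's mean five-year citation rate to PIC 1-4.

    Boundary handling is an assumption: 6 goes to PIC 2,
    11 goes to PIC 3 and 20 stays in PIC 3 ("above 20" is PIC 4).
    """
    if mean_citations < 6:
        return 1
    if mean_citations < 11:
        return 2
    if mean_citations <= 20:
        return 3
    return 4


# Hypothetical journal: 100 papers cited 1,300 times within the five-year window.
rate = journal_mean_citations(1300, 100)
print(rate, potential_impact_category(rate))  # 13.0 3
```

Because the category is attached to the journal rather than to the individual paper, every paper in a given journal receives the same PIC regardless of its own citation count.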

Furthermore, to gain an understanding of the different collaboration links, the
addresses and number of authors in the papers were analysed and classified using
counting macros. The numbers of national and foreign addresses were classified for 
each paper. Collaborative links within the group occur with papers co-authored by the 
leader and one or more members within the group (one address within national 
borders). Papers with national links are those with more than one address within 
national borders. In the international category are those papers co-authored with one or 
more other countries but only one address within national borders. Papers with both
national and international links are classified as mixed and had two or more national
addresses and one or more international addresses.


Table I. Categorisation of journals by research level (RL)

Research level   Clinical definition      Non-clinical definition   Example
RL 1             Clinical observation     Applied development       Acta Tropica
RL 2             Clinical mix             Applied research          New England Journal of Medicine
RL 3             Clinical investigation   Strategic research        Immunology
RL 4             Basic research           Basic research            Nature

Source: Narin et al. (1976); Lewison (2001)


Table II. Classification of journals by potential impact category (PIC)

PIC   C0-4 range      Example
1     Below 6         Brazilian Journal of Medical and Biological Research
2     From 6 to 11    American Journal of Tropical Medicine and Hygiene
3     From 11 to 20   Journal of Allergy and Clinical Immunology
4     Above 20        Journal of Immunology

Source: Lewison (2001)


Table III. Categorisation of acknowledgment by type of resource received

Figure 1. Brazilian and Colombian scientific publications of research groups under investigation
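
The collaboration categories defined in this section (within-group, national, international and mixed) depend only on the counts of national and foreign addresses on each paper, so the rule can be sketched in a few lines of Python. This is an illustrative reading of the definitions, not the counting macros actually used in the study; the function name, the category labels and the sample records are hypothetical, and single-author papers are treated separately as papers not in co-authorship.

```python
from collections import Counter


def classify_links(n_authors: int, n_national: int, n_foreign: int) -> str:
    """Classify a paper by its collaborative links from its address counts."""
    if n_authors < 1 or n_national < 1 or n_foreign < 0:
        # Every paper in the study has at least one national address (the group leader's).
        raise ValueError("invalid author or address counts")
    if n_authors == 1:
        return "no co-authorship"
    if n_foreign == 0:
        # Only national addresses: one address means the group itself.
        return "within group" if n_national == 1 else "national"
    # At least one foreign address.
    return "international" if n_national == 1 else "mixed"


# Hypothetical records: (authors, national addresses, foreign addresses).
papers = [(4, 1, 0), (5, 3, 0), (6, 1, 2), (8, 2, 1), (1, 1, 0)]
tally = Counter(classify_links(*paper) for paper in papers)
for category, count in tally.items():
    print(category, count)
```

Tallying the categories per country in this way yields the kind of breakdown that is needed to compare the shares of national and international co-authorship between the two communities.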

Additionally, nearly 82 per cent of the articles in the database were inspected and
the acknowledgements section was classified and registered in the database according
to the categories in Table III. This procedure allowed the determination of patterns in
the practice of acknowledgement and revealed funding sources as well as other
researchers taking part in the process of knowledge production; they could be analysed
by research group and by country.

Finally, to gain an understanding of the factors influencing the collaborative
behaviour of Brazilian and Colombian research groups in immunology, we conducted 
semi-structured interviews with 37 members of these research groups. There were 28 
from Brazil (15 leaders, ten other group members and three international peers); and 
nine from Colombia (seven leaders and two other peers). Quantitative data interpreted 
in the light of these interviews throw considerable light on the attitudes of the groups 
concerning the cognitive and social aspects of knowledge production. 


Type            Category   Acknowledgment given for
Financial       A1         Financial support
Non-financial   A2         Moral support and other facilities to develop the
                           project (members of social communities)
                A3         Access to protocols and reagents
                A4         Technical assistance within the discipline
                A5         Technical assistance outside the discipline
                A6         Peers’ interactive communication
                A7         Preparation of manuscript

Source: Cronin et al. (1992); Cronin (1995); McCain (1991)


Results and discussion

Published outputs and co-authorship

[Figure 1. Legend: Brazil; Colombia. Source: SCI (1990-1999)]

Figure 1 shows the total numbers of papers published during the last decade by the
Brazilian and Colombian immunology research groups under investigation. This total
comprises 844 records: 650 Brazilian and 194 Colombian publications (Roa-Celis,
2002). In both countries there is an increase in output over time. In Colombia the 
increase in government expenditures on S&T between 1992 and 1996 (RICYT, 2002) is
reflected in the continuous increase in outputs for this community between 1995 and 1998.
In Brazil, studies have shown a steady rise in research productivity from 1983 
onwards, becoming more intense during 1988-1991, which is linked to an increase in 
the number of papers in international co-authorship (Rumjanek and Leta, 1996; dos 
Santos and Rumjanek, 2001). 

However, the findings of the present study show that the Brazilian papers 
co-authored amongst national researchers contributed markedly to the increase in 
articles published in indexed journals, whilst international collaboration contributed 
significantly to the performance of Colombian researchers (see Table IV). The shares of papers
co-authored within the groups and of mixed papers (national and international
researchers) are similar in both communities under study. Papers without
co-authors were rare: 1.8 per cent and 1.5 per cent of the totals of
Brazilian and Colombian papers, respectively.

The data from interviews confirmed that the majority of Brazilian groups analysed 
have been active in establishing national alliances through a diversity of national and 
regional scientific meetings, with the support of national funding institutions and in 
accordance with national policies. 

These data present similar patterns to those revealed by Narvaez-Berthelemot et al 
(1992), which showed that some countries produced more locally than internationally 
during the period 1981-1986: Brazil produced 26 per cent internationally and Colombia 
48 per cent. Our results add evidence to those studies’ suggestion that scientifically 
smaller countries have a higher percentage of papers co-authored with foreign 
institutions in mainstream publications than do those with larger outputs 
(Narvaez-Berthelemot et al, 1999). 

Figure 2 shows the extent of international collaboration expressed as percentages of 
co-authored articles. Overall, this information reveals that a higher proportion of the 
Colombian publications were written in collaboration with foreign researchers than of 
the Brazilian publications. Thus Colombian and Brazilian researchers have 24 per cent 
and 11 per cent respectively of their publications co-authored with colleagues from the 
US. The equivalent percentages for European researchers are 34 per cent of Colombian 
publications and 18 per cent of Brazilian publications. Figure 2 also reveals that 
Brazilian researchers publish very little with other groups in Latin American, African 
and Asian countries. So far, these data confirm that such regional communities are not 
part of common research networks. 

In contrast with previous analyses, the data show that for Colombia there has been a
decrease in the amount of US co-authorship, which represented 47 per cent of its foreign
links, with particularly high rates in the life sciences fields, during 1981-1986
(Narvaez-Berthelemot et al, 1992). Links with the European Community (EC) have








Country None Within group National International Mixed 
Brazil (%) 1.85 22.46 43.54 19.54 12.62 
Colombia (%) 1.55 25.77 13.40 46.39 12.89 


Note: The above refer to scientific research groups under investigation, distributed by collaborative 
links 
Table IV. Production in immunology (SCI, 1990-1999)

Figure 2. Brazilian and Colombian papers in immunology co-authored with foreign colleagues

Notes: Spain (ES), Portugal (PT), United States (US), Canada (CA), United Kingdom (UK),
France (FR), Switzerland (CH), Sweden (SE), Belgium (BE), Germany (DE)


increased from those occurring during 1986-1990 (18 per cent) and those with the US 
have decreased (40 per cent) (Narvaez-Berthelemot, 1992). Links with individual EC 
member states are important in the co-authored outputs, for instance, France (FR) with 
Brazil and Spain (ES) with Colombia since the period 1986-1991 (Lewison et al, 1993). 

The percentages of inter-Latin America co-authorship continue at the low levels 
noticed by Lewison et al. (1993) during the period 1986-1990. According to the 
comments of some of the researchers we interviewed, these research communities do 
recognise the benefits of regional partnerships, but factors such as financial difficulties 
and differences in approach have made the continuation of formal alliances difficult. 
However, informal cooperation between these countries and within the region does 
exist, and there are established cross-border initiatives among scientific communities 
to increase co-authored outputs, for instance, in the framework of Mercosur countries 
(Narvaez-Berthelemot et al, 1999). 

The qualitative information gathered revealed that links with foreign colleagues
have been influenced by factors such as:

* geographical proximity and cultural identities, especially when countries within 
the region or expatriates now living abroad are involved. Thus, sharing the same 
language is an important explanation for the larger collaboration between 
Colombia and other Latin American countries, as compared to Brazil, whose 
researchers show a five-fold smaller collaboration effort with other Latin 
American researchers; 


* research interests in thematic pathologies that are more common in peripheral
regions; and


* postgraduates going abroad and/or the modalities of sandwich programs. 








Subject fields 

According to Table V, the Brazilian and Colombian groups analysed showed different 
preferences in the selection of journals chosen to publish their articles during the period 
1990-1999. These journals are located in a variety of sub-fields with major percentages
in clinical medicine (68 per cent and 72 per cent) and biomedical research (27 per cent 
and 23 per cent) respectively for Brazil and Colombia. Specifically, 31 per cent and 34 
per cent are located in the sub-field of immunology, meaning that two-thirds of the 
production in both communities is more relevant to other sub-disciplines. This shows a 
high level of interdisciplinary work, particularly in Brazil. 

These results indicate that research groups both in Brazil and Colombia tend to 
address research subjects relevant to local pathologies, as they concentrate particularly 
on tropical medicine, parasitology and clinical studies related to malaria, leishmaniasis 
and allergies. Molecular biology and microbiology are also important disciplines, as 
they have been part of the researchers’ learning traditions. 

Table V also shows that these immunologists tend to publish in a wide range of 
journals, although the bulk of their production is concentrated in just a few of them. 
During the period of the study, the Brazilian papers appeared in 182 different journals: 
55 per cent were concentrated in just 25 journals and the other 45 per cent in 157. In 
Colombia, 31 journals cover 79 per cent of the corresponding production and 40 
journals the remaining 21 per cent. The most popular journals among the Brazilian 
immunologists are the Brazilian Journal of Medical and Biological Research (nearly 12 
per cent) and the Memorias do Oswaldo Cruz (7 per cent). The Journal of Allergy and 
Clinical Immunology (11 per cent), Hygiene and Immunology (6 per cent) and American 
Journal of Tropical Medicine (6 per cent) are the preferred journals of the Colombian 
researchers. 

Our interviews with researchers confirmed that the research groups in question try 
to concentrate their research on topics that are relevant to local and regional health 
problems. Their strategy is to tackle these problems working with different specialists, 
which allows them to produce results and publish in different disciplines. 


Type of research and potential impact 

Figures 3 and 4 illustrate the correlation between the type of research (from applied 
development or clinical observation = RL1 to basic research = RL4) and the potential
impacts of the journals where these research groups published (low impact = PIC1 to
high impact = PIC4). Figure 3 shows that the Brazilian groups published nearly 70 per
cent of their papers in journals of low impact despite the fact that nearly 50 per cent are
in the RL3 and RL4 categories, that is, more “basic” research. The Colombians published
nearly 45 per cent of their papers in high impact journals (PIC3 and PIC4), and also 
tended to concentrate on basic research. 

Although publishing a paper in a high impact journal does not guarantee a high 
level of citation, it does make it more likely (Lewison, 2001). In order to see how the 
potential impact of the journals influenced the distribution of citations, the numbers of 
citations to articles published between 1990 and 1996 were investigated (see Table VI). 

The results overall showed that the papers analysed have a relatively low level
of citation. About 67 per cent of Brazilian and 57 per cent of Colombian publications
were cited between one and ten times in the five years following publication. Although
the number of citations is slightly higher in the citation ranges 11-20 and 21-30 for
Figure 3. Percentage of Brazilian articles distributed by RL and PIC categories

Figure 4. Percentage of Colombian articles distributed by RL and PIC categories








Source: SCI (1990-1999) 
Colombian immunologists, both communities have a similar pattern, which is a low
citation level in the subsequent literature. But this does not mean that the quality of
work of these communities is low. Because of the large number of
publications in this area in the international literature, it is to be expected that papers
will receive either few or no citations (O'Driscoll et al, 1995). Other studies have
already revealed similar results of a low range of citations for a high number of 
publications (Rumjanek and Leta, 1996). However, prior analysis in clinical medicine 
has suggested that when the cumulative number of articles published is small, impact 








Number of citations Brazil (%) Colombia (%) 
0 10.74 7.69 
1-10 67.22 57.14 
11-20 10.47 17.58 
21-30 5.51 8.79 
31-40 2.48 4.40 
41-50 1.10 0.00 
51-60 0.83 1.10 
61-70 1.10 1.10
71-80 0.00 0.00
81-90 0.28 0.00
91-100 0.00 1.10
>100 0.28 1.10


calculations may well be skewed by a few highly cited publications (Krauskopf et al., 
1995). 

A closer analysis of our sample confirms that the most frequently cited articles were
those published in the highest impact journals (PIC = 4). This occurred with the
articles published in the New England Journal of Medicine, Nature Genetics, Lancet,
Nature Medicine, Journal of Immunology, and Proceedings of the National Academy of
Sciences of the USA. Most of these papers had multiple authors (between 5 and 18)
and contributors from different institutions and countries, while only
two of them had just two authors.

This confirms the view that groups in small communities tend to publish in the
highest impact international journals with the aim of attaining greater visibility and
because they do not have national journals available that are internationally indexed
(see Luukkonen et al., 1992). The results of the present study add that national
journals in the low impact ranges of the international index can reduce
visibility when researchers find it convenient to publish in them. Moreover, the data add evidence
that, in certain fields, Latin American researchers tend to publish more in their own
nation’s mainstream journals despite their low impact (Krauskopf et al, 1995).


Patterns of acknowledgments 

Figure 5 shows the percentages of acknowledgements pertaining to the categories 
mentioned in Table III above. Overall, the results confirm that research groups in both
Brazil and Colombia use acknowledgements, in the majority of their papers, to 
recognise financial and non-financial support. 

Figure 5 also shows that there are high and similar percentages of 
acknowledgements for financial support (A1) in the papers of both Brazilian and 
Colombian research groups. However, the analysis showed that the percentage
of acknowledgements to international funding was twice as high in the Colombian papers, while in the Brazilian
papers the acknowledgements are mostly to national funding (see Table VII).
Moreover, the analysis reveals that there is a relationship between international 
funding and collaboration with foreign peers for the Colombians and national funding 
and national collaboration for the Brazilians. 

The national institutions most frequently listed in the type A1 acknowledgements
are government agencies supporting S&T and the health sector, especially 

Table VI. Distribution of five-year citation scores for Brazilian and Colombian publications under study, 1990-1996

Figure 5. Percentage of acknowledgments by type, Brazilian and Colombian publications

Notes: A1 = Financial support; A2 = Moral support and other facilities to develop the project;
A3 = Access to protocols and reagents; A4 = Technical assistance within the discipline; A5 =
Technical assistance outside the discipline; A6 = Peers’ interactive communication; A7 =
Preparation of manuscript

Source: Roa-Celis (2002); SCI (1990-1999)

Table VII. Acknowledgments of financial support on Brazilian and Colombian immunology papers by type of funding agency, 1990-1999 (%)

Country     International agencies   National agencies
Brazil      21.7                     78.3
Colombia    44.87                    55.12

Source: Roa-Celis (2002)

universities, foundations and research councils. References to private industry funding 
are few. For instance, in the Brazilian papers, funding from CNPq, FAPESP, FINEP, 
FAPEMIG, FIOCRUZ, and Foundation Butantan stands out, and in the Colombian 
papers, funding comes from COLCIENCIAS, the Presidency of the Republic and the 
Ministry of Health. In general, international funding came from other countries’ 
governmental agencies and foundations, as well as non-profit organisations such as 
collecting charities and foundations. Very little financial support was received from the 
pharmaceutical and other industrial sectors. The international and foreign institutions 
most commonly named in these types of acknowledgments were: 
* Special Program for Research and Training in Tropical Diseases (UNDP/World
Bank/WHO);
* Institut National francais de la Sante et de la Recherche Medicale (INSERM) 
(France); 


* Centre National de la Recherche Scientifique (CNRS) (France); 
* The National Institutes of Health (USA); 
* Commission of the European Communities (CEC); 








* German Leprosy Relief Association; and 
* Medical Research Council (Canada). 


Figure 5 also shows results for other kinds of acknowledgements. First,
acknowledgement type A2 can be understood as recognising the contribution of members of
society at large, such as patients and those who offer moral support that helps
the projects to produce knowledge. Type A3 acknowledgements are made in
connection with shared protocols, reagents and access to as yet unpublished results,
and recognise informal collaboration from colleagues or from institutions. The next
type of acknowledgement, A4, indicates technical assistance within the discipline, and
is significantly recognised by both communities studied here. Technical contributors
include postgraduate students and technicians. Depending on the tradition and
practice of the groups, these types of acknowledgement (A3 and A4) could have been
changed to co-authorship if this had been agreed at the beginning of the research
process.

Acknowledgement type A5 mentions technical help from people in other
sub-fields, in respect of their contributions in statistical analysis, data processing or
other computer skills. Type A6 acknowledgements express the recognition of peers or
mentors whose previous reading and discussions helped to advance the
experiment or project, but who did not do enough to warrant authorship of the paper.
Finally, acknowledgement type A7 expresses recognition for informal
discussions of the paper and linguistic review, which helped these groups to polish
the final product.


Conclusions 

The study revealed that the research output of the Brazilian groups in immunology 
tends to be published in journals of low impact in comparison with the Colombian 
groups analysed. Qualitative analysis reveals that there is a tension between social and 
cognitive factors within a framework of formal and informal collaborative links. This 
affects the selection of such journals. The analysis showed: 


* a significant level of inter-disciplinarity in the Brazilian research output 
compared with more specialised and disciplinary research efforts by the 
Colombians; 


* considerable local and national collaboration triggered by networks and their 
invisible colleges for Brazilians, with high levels of consolidation of 
post-graduate programmes and a tradition of regional research themes. In 
Colombia, more international networks existed, created by sandwich doctoral 
programmes; 

* a strategy of publishing in national or regional journals indexed in the SCI by 
Brazilians, whereas the Colombians prefer mainstream international journals; 

* the existence of more national funding available for Brazilian researchers 
compared with the Colombian need to obtain greater levels of international 
funding; and 


* a negligible participation of industry in formal and informal collaboration in both
Brazil and Colombia.

The results obtained revealed that the dynamics of scientific production take place
within the framework of formal and informal collaborative links, which depend on a
continuous tension between cognitive and social factors. These tensions inside the
research communities, as well as between them and other social groups, are revealed
by a combination of quantitative and qualitative approaches, both of which need to be 
taken into account in the process of policy making. 


Notes 


1. The National Council for Scientific and Technological Development (CNPq) is a foundation
linked to the Ministry of Science and Technology (MCT), to support Brazilian research
(www.cnpq.br).

2. The State of São Paulo Research Foundation is one of the main agencies to foster scientific
and technological research in Brazil (www.fapesp.br).

3. The Colombian Institute for the Development of Science and Technology Francisco José de
Caldas, “Colciencias” (www.colciencias.gov.co).

Abstract

Purpose — The present study aims to overview Brazilian human resources and scientific output in 
astronomy, immunology and oceanography during the last decade. 
Design/methodology/approach — Data on human resources and on scientific output were obtained 
from the Brazilian database, the Directory of Research Groups. Scientific outputs were also analysed 
from a set of journals catalogued by the Institute for Scientific Information: the 20 journals with the 
largest number of articles in 2003. 

Findings — Compared with the other two fields, the number of Brazilian researchers in astronomy 
did not grow between 1997 and 2002, but they are the most qualified: more than 90 per cent of them have
a PhD degree. Most astronomy publications are in international journals and they are well cited. The
most cited astronomy papers are on international topics, but this is not true for the oceanography 
papers. 

Research limitations/implications — These data are derived from a particular set of publications 
and should be interpreted as trends rather than as definitive. 

Originality/value — This study, which covers three fields with different structures and traditions, 
provides a snapshot of some features of the whole of Brazilian science, and will provide evidence for 
new science policies. 

Keywords Brazil, Astronomy, Oceanography, Human resource management, Information research

Introduction
The establishment of Brazilian S&T activities is a recent event in the country’s history. 
The most important governmental funding agencies, CNPq and CAPES[1], began 
operations at the beginning of the 1950s. However, a national programme of training 
Brazilians to carry out research was implemented only in the 1980s. During the last 
two decades, the large number of fellowships granted by these agencies to students 
enrolled in graduate courses was decisive for the growth of Brazilian science (Leta et al, 
1998). 

The large but recent investment in training researchers has led to an expansion of 
Brazilian science and technology activities. This has been frequently estimated by the 

In order to understand and to map Brazilian science, many bibliometric studies have
been carried out in recent years. The first analysis of Brazilian scientific literature
took place in the 1970s (Morel and Morel, 1977). Thereafter, most such studies have
focused on the analysis of publication trends in specific fields (Azevedo, 1984;
Meneghini, 1992; Spagnolo, 1990; Rumjanek and Leta, 1996). Recently, Leta and
Lewison (2003) described the contribution of women in three different fields,
astronomy, immunology and oceanography. The authors chose these fields mainly
because they differ greatly in both their subject matter and their share of women
scientists. Using data on publications in these fields, collected from the ISI database,
the authors identified and counted Brazilian women’s and men’s outputs. They found
that women published as much as men, in terms of both quantity and quality. Studies
of sex and science using quantitative methods, such as the above and that of Plonski 
and Saidel (2001), are still uncommon in Brazil. Thus, this article has played an 
important role in the discussion of this theme by mapping and monitoring women’s 
participation and success in Brazilian science. 

Apart from the study carried out by dos Santos and Rumjanek (2001) on
Brazilian immunology, the scientific outputs of these fields in Brazil
have not been studied. Besides the different characteristics of their subject matter and
the participation of women, the three fields differ also in their process of
institutionalisation in the country. Brazilian astronomy was officially started in
1827, when the emperor D. Pedro I founded the National Observatory. Immunology 
was started at the beginning of the twentieth century, with the foundation of the 
Oswaldo Cruz Institute[2]. As for oceanography, it was officially started in the country 
during the 1940s when the first academic institute devoted exclusively to teaching and 
research in this field was founded, the Institute of Oceanography of Sao Paulo. 

Concerning human resources and scientific outputs, did the differences in subject 
matter and institutionalisation influence the present state of the art of the three 
subjects in Brazil? In order to seek a response to this question, the present paper aims 
to outline trends in human resources and in publications in these fields in recent years. 


Methodology 

Data on the Brazilian scientific community 

These were collected from the Directory of Brazilian Research Groups, a Brazilian 
database designed and organized by CNPq in 1993 (CNPq, 2004). The agency has 
carried out six national censuses of Brazilian research groups and all this information 
has been incorporated in the database. Although researchers are not compelled to take
part in the censuses, it is estimated that the database covers around 80-90 per cent of 
the whole Brazilian scientific community. Thus, the database is one of the most 
important mechanisms in the country for: 


* planning national policies; 

* getting rapid and objective information; and 

* preserving an archive of science in Brazil. 
The database is freely available and includes data from research groups, such as 
names and fields of research, the groups’ members and their scientific production. The 


members are classified as researchers, students and technicians and they have to be
involved in scientific projects at Brazilian universities, research institutes, industries
etc. “Researcher” in the CNPq database means a scientist who leads a research group or
an associate scientist who collaborates with a leader of a research group. Both
categories of researcher may or may not have a PhD degree: this condition varies a lot
among the fields.

Usually, to register a research group and participate in the census, the leader
requests certification from the administrative research office of the institution with which he
or she is affiliated. That means that researchers cannot register their
groups, and so take part in the census, on their own initiative. Not all such requests to register are granted.
Leaders who receive certification are allowed to register the group and return
all the information requested, such as name, field, affiliation and details of the members
and publications.

Although the censuses started in 1993, the method of collecting and compiling the data
was changed after the second one in 1995. So, the time trend for the
characteristics of the scientific community shown here includes only the censuses of
1997, 2000 and 2002. The data from the 2004 census are not yet available.


Data source 

The 20 journals with the largest number of articles (LNA) in each of the three fields were
selected from the 2003 Journal Citation Report list available at the ISI web site.
Analysing the fields’ scientific output from this set of journals is justified because
Brazilian publications represent less than 2 per cent of the ISI database. This means
that around two Brazilian articles would be found in a journal that publishes 100
articles per year and 20 in a journal that publishes 1,000 articles annually.
Choosing the journals with the largest numbers of articles therefore gave a reasonable
chance of finding some Brazilian articles. Publication data from this set of journals (LNA)
were collected from the Web of Science. The searches combined the address filter and the
source title filter: each group of 20 LNA journals selected in the three fields was, separately,
combined with the address Brazil or Brasil. Publication data from these searches were
saved and analysed with the help of Excel software. The resulting databases contain
all information available at the Web of Science such as full references, authors’ names
and address(es), accumulated citation numbers and type of publication. 
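
The search strategy described above amounts to a two-condition filter. The following is a minimal sketch, assuming the exported records are available as dictionaries; the field names `source_title` and `addresses`, the function name and the sample journal titles are hypothetical and do not reproduce the Web of Science export format.

```python
def select_brazilian_records(records, lna_titles):
    """Keep records published in an LNA journal with a Brazilian address."""
    # Normalise titles once so the comparison is case-insensitive.
    wanted = {title.strip().lower() for title in lna_titles}
    selected = []
    for record in records:
        in_lna = record["source_title"].strip().lower() in wanted
        # The paper searched for both spellings of the country name.
        brazilian = any(
            "brazil" in address.lower() or "brasil" in address.lower()
            for address in record["addresses"]
        )
        if in_lna and brazilian:
            selected.append(record)
    return selected


# Hypothetical records, for illustration only.
sample = [
    {"source_title": "Journal A", "addresses": ["Univ Sao Paulo, Brasil"]},
    {"source_title": "Journal A", "addresses": ["Univ Oxford, England"]},
    {"source_title": "Journal B", "addresses": ["UFRJ, Rio de Janeiro, Brazil"]},
]
matches = select_brazilian_records(sample, ["Journal A"])
print(len(matches))
```

Only the first sample record satisfies both conditions: the second has no Brazilian address and the third is not in the selected journal set.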


Data on impact factor 

Since the searches on the Web of Science gave only the accumulated citations of the
publications, data on citations had to be normalized to find a comparable impact factor.
This was measured by the ratio: number of citations to publications of year X
(from year X up to October 2004), divided by the number of publications of year X and
again divided by the number of years from publication to 2004. For example, there were 26
Brazilian astronomy publications from year 1988 and they received a total of 510
citations up to October 2004, so the impact factor was calculated as
510/(26 × 17) = 1.15. This method allows a comparison of citation data found for
publications published in the five years studied. 
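
The normalisation can be written out as a small function. This is a sketch of the ratio as described in the text, not the authors' own spreadsheet; the function and parameter names are illustrative, and the number of years (17 in the worked example) is passed in explicitly rather than derived from the dates.

```python
def normalised_impact(citations, publications, years):
    """Citations per publication per year since publication."""
    if publications <= 0 or years <= 0:
        raise ValueError("publications and years must be positive")
    return citations / (publications * years)


# Worked example from the text: 26 publications, 510 citations, 17 years.
example = normalised_impact(citations=510, publications=26, years=17)
print(round(example, 2))  # 1.15
```

Dividing by the number of years keeps older cohorts, which have had longer to accumulate citations, comparable with more recent ones.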


Results and discussion 

The Brazilian scientific community — number of researchers, qualifications and sex 
Details on the Brazilian scientific community can easily be found on the Directory of 
Brazilian Research Groups database. Table I shows the total numbers of researchers 

Table I. Astronomy, immunology and oceanography. Numbers of Brazilian researchers and of those with a PhD degree, 1997-2002


registered on the database in the three fields as well as in the whole of science for three
censuses. The data indicate that the Brazilian scientific community is growing fast:
from 1997 to 2002, the total number of Brazilian researchers increased from 33,675 to
56,891, or by 69 per cent. However, in immunology and oceanography, the increase was
only about 40 per cent and in astronomy there was a reduction in the number of
researchers. There are three possible explanations: some astronomers may have
declined to participate in the recent censuses; there are few new positions for
astronomers at Brazilian research institutes and universities; or there is a paucity of
physics undergraduates, and so of graduate students in astronomy and, in turn, of
researchers. A detailed analysis would be needed to understand this tendency. 
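
The growth figures quoted above can be checked against the totals in Table I. The sketch below recomputes the percentage change between the 1997 and 2002 censuses; the dictionary layout and function name are illustrative choices, while the counts are those reported in the table.

```python
def percent_change(start, end):
    """Percentage change from start to end."""
    return (end - start) / start * 100


# Total researchers in the 1997 and 2002 censuses (Table I).
totals = {
    "Astronomy": (182, 163),
    "Immunology": (346, 502),
    "Oceanography": (291, 416),
    "All science": (33675, 56891),
}

growth = {field: percent_change(a, b) for field, (a, b) in totals.items()}
for field, change in growth.items():
    print(f"{field}: {change:+.1f}%")
```

On these counts the overall increase is about 69 per cent, immunology and oceanography grow by roughly 45 and 43 per cent, and astronomy falls by about 10 per cent.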

The scientific communities in the three fields show differences in the fraction of
researchers with a PhD degree (Table I). Relative to the total number of researchers,
astronomers seem to be the most qualified, followed by immunologists, while
researchers in oceanography are only slightly more qualified than the average for
Brazilian researchers. The high percentage of PhDs in astronomy may reflect the small
size of the research community. In a country such as Brazil, where resources are scarce,
fields with a small number of scientists tend to have the most qualified ones. Another
factor may be the long tradition of astronomy in Brazil, which has allowed it to gain
international recognition and so requires Brazilian astronomers to be more competitive.
On the other hand, the recent consolidation of oceanography in the country, as well as
its increasing interest for young Brazilians, are factors that may have contributed to
the low number of PhD-qualified oceanographers.


                      1997                   2000                   2002
                      With PhD               With PhD               With PhD
Fields          Total      n   %*      Total      n   %*      Total      n   %*
Astronomy         182    155   85        164    148   90        163    159   98
Immunology        346    245   71        420    320   76        502    414   82
Oceanography      291    165   57        362    203   56        416    209   62
All science    33,675 18,536   55     48,781 27,662   57     56,891 34,349   60

Note: *Represents the share of researchers with a PhD degree of the total for the respective field and year

Source: CNPq database

Table II. Astronomy, immunology, oceanography and all science: number of total women researchers in Brazil

                      1997                   2000                   2002
                      Women                  Women                  Women
Fields          Total      n   %*      Total      n   %*      Total      n   %*
Astronomy         182     NA   NA        164     37   23        163     37   23
Immunology        346     NA   NA        420    247   59        502    296   58
Oceanography      291     NA   NA        362    150   41        416    168   40
All science    33,675 14,139   42     48,781 21,252   44     56,891 26,021   46

Notes: *Represents the share of female researchers to the total for the respective field and year;
NA = not available

Source: CNPq database








Concerning the sex of the researchers (Table II), the highest female/total ratio was
found in immunology at about 58 per cent. Oceanography has an intermediate ratio at
41 per cent, while the lowest was found in astronomy at 22 per cent. The share of
women in the three fields did not grow from 2000 to 2002. The different shares of
women among the three fields may be related to the numbers of men and women
enrolled in undergraduate and graduate courses at Brazilian universities. The high share
of women in biological sciences as well as in social sciences and humanities compared
with their low share in engineering and exact sciences has been frequently discussed
worldwide (for example: McGregor and Harding, 1996; Lane, 2004; Tabak, 2002). In
Brazil, as in most countries, including developed ones, women are still the minority in 
some undergraduate courses such as physics, mathematics and chemistry. The causes 
for such “exclusion” are complex and involve many factors, including cultural, social 
and economic ones. 


The scientific publications registered in the Brazilian database 

Most of the scientific articles published by developing countries, such as Brazil, are not
read by the international research community and are therefore “invisible”, i.e. they are
inaccessible and unknown outside the country (Gibbs, 1995). In Brazil, it is estimated
that 80 per cent of the country’s scientific literature is published only in domestic
periodicals. Because Brazilian scientific periodicals have a limited distribution,
most of them are in Portuguese and the majority are not available electronically,
most of their articles circulate almost exclusively within the main Brazilian university
libraries.

On the database of the Directory of Brazilian Research Groups, researchers are 
encouraged to register details of all their publications, even these “invisible ones”. 
Thus, this database represents an important resource to map the visibility of the 
country’s scientific output. The pattern of publications registered by Brazilian 
researchers (only by those who have a PhD degree) in the three fields is presented in 
Figures 1 and 2. 

Among the three fields, immunology presents the largest number of publications
registered in the database, with 6,752 versus 1,962 in astronomy and 2,631 in
oceanography. This is probably related to the number of researchers registered within
each of the three fields (Table I). Although astronomy has the lowest number of
Figure 1.

Pattern of publications in 
astronomy, immunology, 
oceanography and all 
Brazilian science (Brazil) 
registered at the CNPq 
database, 1997-2000 (%)

Figure 2.

Full articles from Brazil in 
astronomy, immunology, 
oceanography and all 
science (Brazil) registered 
on the CNPq database that 
are published in 
international and national 
journals, 1997-2000, (%) 





publications, its distribution differs from that of the other two fields and from that
found for all science in Brazil: almost 50 per cent of its publications are classified as
“full articles”. On the other hand, immunology and oceanography seem to follow the
pattern of the rest of science: “full articles” represent around 30 per cent of the total
publications. This difference may be a consequence of the qualifications of the
researchers (Table I) or the size of the research groups. The number of people (mainly
students) who are involved with the research may change the pattern of publications.
A researcher who is, for example, advising only a single graduate student who sent an
abstract (résumé in Figure 1) to a conference in a given year will populate the census
with this information only. Another researcher, who is advising three or four graduate
students, each of whom submitted an abstract, would account for more papers.

If the size of the group may influence the pattern of publications, the same is
not true when full articles are analysed according to the origin of the journals in which
they were published (Figure 2). In the CNPq database, researchers are supposed to
indicate whether the paper they are registering was published in an international
journal or in a national journal. The first set of journals is defined as all journals that
are not published and/or edited inside the country. Thus, it includes not only journals
covered by ISI but also journals covered by other databases, such as MedLine, and
some that are not covered by any database.

Astronomers and immunologists clearly prefer to publish their
studies in international journals: 936 out of 959 (almost 98 per cent) and 1,594 out of
2,001 (almost 80 per cent), respectively. Conversely, the distribution of full articles in
oceanography is similar to that found for all science in Brazil, around 50 per cent. The
difference found here is probably due not only to internal practices and interests of the
three fields but also to differences in their institutionalisation in the country. Clearly,
research in astronomy, more than in the other two fields, is oriented toward international
interests and peers. This explains its large share of articles in international
journals. In two previous studies (Figueira et al., 2003; Leta et al., 2005), it has been
demonstrated that research articles (original papers) are the type most frequently
found among Brazilian articles published in international journals. Reviews and
case reports, however, are the most frequent types of publication found in
Brazilian national or domestic journals.
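
The shares quoted above for astronomy and immunology can be re-derived from the counts given in the text. The following is only a minimal sketch of that arithmetic; the function name `share_percent` is an illustrative choice and not part of the original study.

```python
def share_percent(part, total):
    """Return part/total as a percentage."""
    return 100.0 * part / total


# Full articles in international journals / all full articles,
# using the counts quoted in the text for 1997-2000.
full_articles = {
    "astronomy": (936, 959),
    "immunology": (1594, 2001),
}

for field, (international, total) in full_articles.items():
    print(f"{field}: {share_percent(international, total):.1f}% in international journals")
```

Rounded to whole numbers, these values agree with the "almost 98 per cent" and "almost 80 per cent" stated above.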






[Figure 2: horizontal bars for oceanography, immunology and astronomy; horizontal axis from 0% to 100%; legend: national journals, international journals]





Brazilian scientific output registered in the ISI databases 
A number of factors cause Brazilians not to publish in journals indexed in the SCI, such 
as the very low number of national journals covered; a supposed low quality of 
scientific production; language barriers; and varying editorial standards and 
submission practices among international journals. Despite all these difficulties, the 
number and the share of Brazilian publications in the ISI databases have increased 
remarkably over the last decades, thus increasing the visibility of its science. The 
increasing numbers of co-authored publications as well as those of people trained for 
S&T are frequently indicated as the reasons for the growth of Brazilian publications in 
the ISI database. 

In the present study, the original idea was to map the visibility of the three fields
according to the type of journal: the 20 journals with the largest number of articles
(LNA) and the 20 journals with the highest impact factor (HIF). But, as can be seen
from Table III, many of the 20 journal titles appear in both sets,
LNA and HIF, in all three fields. Among the three fields, the lowest overlap of journals
occurred in immunology: there are only eight journal titles common to the lists of LNA
and HIF journals; the other 12 titles in each list appear in that list only.

The time trend analysis indicates that Brazilian ISI publications (published in the 20 
journals with the largest number of articles) in the three fields have been increasing, 
see Figure 3. Although the scientific community in astronomy is the smallest, and 


                 Journals                         Publications
Fields           LNA only   Both   HIF only      LNA only    Both        HIF only
Astronomy            7       13        7         144 (4)     400 (13)    8 (1)
Immunology          12        8       12         245 (12)    181 (8)     2 (1)
Oceanography         7       13        7         35 (4)      48 (9)      6 (4)


Notes: Numbers in parentheses are the number of journals where publications were found. In
astronomy, for example, 544 publications were published in 17 LNA journals (four of these journals
were found only in the LNA list and 13 were found in both lists). Although the search in the
Web of Science was performed on the 20 LNA astronomy journals, Brazilian articles were not found in
three of them
Source: SCI

Table III.

Overlap of Brazilian 
publications and journals 
in three fields; 20 journals 
with the largest numbers 
of articles (LNA) and the 
highest impact factor 
(HIF) 


Figure 3. 

Number of Brazilian 
ISI-indexed publications in 
the 20 journals with most 
articles (LNA) in 
astronomy, immunology 
and oceanography

Figure 4.

Share of Brazilian
publications in the 20
journals with most articles
in astronomy,
immunology and
oceanography in the SCI,
per cent in 1988 and 2003





remains so (see Table I), publications from this field increased the most. Astronomy
publications increased from 25 to 183 (x 7.0) while in immunology and oceanography
they increased from 20 to 136 (x 6.8) and from 4 to 25 (x 6.2), respectively.

The share of this set of Brazilian publications (Figure 3) in total ISI publications,
in 1988 and in 2003, is shown in Figure 4. Among the three fields, astronomy is the one
with the largest relative share of publications. Brazil’s share in this field, around 1.9 per
cent, is higher than that for all Brazilian publications in the SCI, which, according to
MCT (2004), was 1.55 per cent in 2002. On the other hand, the shares of Brazilian
publications in immunology and oceanography are lower than that found for all
science.

The data presented in Figures 3 and 4 may be related to specific characteristics of
the fields. The superior performance found for astronomy may be a consequence of the
fact that Brazilian astronomy publications are more oriented to international work,
which may be related to the field's tradition in the country and its qualified researchers.
Moreover, the relatively small number of journals classified by ISI as being in this field
(42) may also contribute to this difference, since it would induce astronomers to
concentrate their publications in just a few journals. For immunology, the large
number of journals classified as relevant to this field (113) and the close relationship
between immunology and other biomedical fields would mean that researchers in this
area make use of a larger range of journals, including journals not
classified as immunology but in other biomedical sub-fields.


Brazilian scientific output registered in the ISI database: journal distribution 

The relationship between Brazilian publications found in the three fields and the total 
number of papers published in the journals where Brazilian publications were found is 
presented in Figure 5. This allows the hypothesis that Brazilian papers have a greater 
presence in journals with many publications (LNA) to be checked. 

At first sight, this hypothesis does not appear to hold. However, astronomy appears to be
an exception. In this field, Astrophysical Journal and Astronomy & Astrophysics are the
two journals with the largest number of articles (2,435 and 1,936, respectively; black
squares) and, as can be observed in Figure 5, these are the journals where most
Brazilian astronomers have published. This pattern does not apply to immunology and
oceanography. In the first two journals in the immunology LNA ranking, Journal of
Experimental Medicine (n = 1,603) and Transplantation (n = 1,154), Brazilians have
published fewer than 20 articles in the five years studied (white triangles). In

[Figure 5: horizontal axis: number of articles in the journals, 2003]

Note: The three fields are represented with different symbols: astronomy (black
squares), immunology (white triangles) and oceanography (white circles). Each
symbol represents a specific journal, which was distributed in the figure
according to its total number of Brazilian publications in the five years (1988,
1992, 1996, 2000, 2003) and its number of total publications in 2003


oceanography (white circles), Estuarine Coastal & Shelf Science and Limnology & 
Oceanography, the two journals with most articles, were not the ones where Brazilian 
oceanographers have published the most. 

Table IV shows the five journals with the most Brazilian articles in the five years
studied. Publications from astronomy tend to be concentrated in journals with the
largest number of articles (numbers in parentheses represent the ranked position of the
journal). As for the total number of publications found in the five journals with the

Astronomy                                            n   Immunology                                        n   Oceanography                                 n

Astrophysical Journal (1)                          118   Transplantation Proceedings (4)                  70   Bulletin of Marine Science (9)              28
Astronomy & Astrophysics (2)                        72   Infection and Immunity (3)                       69   Continental Shelf Research (12)             13
Monthly Notices of the Royal Astron’l Society (3)   67   Clinical Infectious Diseases (6)                 41   Estuarine Coastal and Shelf Science (20)    10
IAU Symposia (7)                                    63   Journal of Allergy and Clinical Immunology (9)   32   Marine Chemistry (6)                         7
Astronomical Journal (5)                            58   Journal of Immunology (19)                       31   Marine Geology (16)                          7
Total publications                                 544                                                   426                                               83


Notes: Numbers in parentheses represent the position of each journal on the ranking of journals with 
the largest number of articles 
Source: SCI

Figure 5.

Distribution of Brazilian 
publications over five 
years according to the 
journals’ total number of 
articles in 2003 


Table IV. 

The five journals with the 
largest number of 
Brazilian articles in the 
five years studied (1988, 
1992, 1996, 2000, 2003) in 
each of three fields

largest number of Brazilian articles, these were 70 per cent and 78 per cent of total
Brazilian publications in astronomy and oceanography (378 out of 544, and 65 out of
83, respectively). In immunology, as publications are more widely spread, the 243 papers
found in this set represent 57 per cent of the field's total.
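
The concentration figures above follow from summing the five journal counts listed in Table IV and dividing by the field totals. The sketch below only retraces that arithmetic; the variable and function names are illustrative and are not taken from the original study.

```python
# Brazilian articles in the five leading journals of each field (Table IV).
top_five = {
    "astronomy": [118, 72, 67, 63, 58],
    "immunology": [70, 69, 41, 32, 31],
    "oceanography": [28, 13, 10, 7, 7],
}

# Total Brazilian publications per field (last row of Table IV).
totals = {"astronomy": 544, "immunology": 426, "oceanography": 83}


def concentration(field):
    """Return (papers in the top five journals, their percentage of the field total)."""
    in_top_five = sum(top_five[field])
    return in_top_five, 100.0 * in_top_five / totals[field]


for field in top_five:
    papers, pct = concentration(field)
    print(f"{field}: {papers} of {totals[field]} ({pct:.1f}%)")
```

The sums reproduce the 378, 243 and 65 papers quoted in the text, and the percentages agree with the reported values to within rounding.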


Brazilian scientific output registered in the ISI database: the most productive institutions
It has been demonstrated that Brazilian science is concentrated in a few institutions,
especially those from the public university system, which are supported by federal and
state government funds. The University of São Paulo, USP, has been identified as the
most productive institution overall (Leta and De Meis, 1996; MCT, 2004).

Table V presents the 10 Brazilian institutions that contribute most to publications in
the three fields in the years studied. As can be noted, USP, the largest university of the
state system, is the institution that has contributed the most to the Brazilian ISI
publications in all three fields. One reason that pushes USP to the top of the leading
Brazilian scientific institutions is the large number of graduate students and qualified
scientists engaged in the S&T activities carried out there. Another important reason is the
continuous, stable and growing funds granted to USP’s researchers by FAPESP, a
state foundation responsible for funding S&T in the state of São Paulo (FAPESP, 2002).

Just as for all Brazilian SCI publications (Leta and De Meis, 1996), universities are
also responsible for most of the publications in astronomy, immunology and
oceanography. In astronomy, seven of the ten most productive institutions are from the
public university system while in immunology and oceanography this ratio is six out
of ten. In astronomy, two important research institutes linked to the Ministry of Science
and Technology, the National Institute of Space Research (INPE) and the National
Observatory (NO), are, after USP, the most productive institutions in the country. In
immunology, the Oswaldo Cruz Foundation, in the city of Rio de Janeiro (FIOCRUZ - RJ)
and in the city of Belo Horizonte (FIOCRUZ - MG), one of the oldest biomedical
research institutes in the country and linked to the Ministry of Health, appears among
the most productive institutions. In oceanography, unusually, an enterprise appears in
the top ten institutions. PETROBRAS, the Brazilian Oil Company (see PETROBRAS,
2004), is one of the largest oil companies in the world, leading the sector in the
implementation of the most advanced deep-water technology for oil production. The
company supports an important research centre, “Centro de Pesquisas e
Desenvolvimento Leopoldo Americo Miguez de Mello” or CENPES, which has
revenues of approximately 1 per cent of the company’s turnover.

Brazilian scientific output registered in the ISI database: citations and the most cited articles
One of the most frequently used pieces of information in the ISI database is
the number of citations an article receives each year. This indicator is often used to
estimate the quality or the impact of the scientific output of a country, an institution
or a field. Nevertheless, many critics have questioned this supposed relationship
between citations and quality, as it is known that there are many reasons to cite an
article. Despite the limitations of this indicator and the controversy surrounding it,
Leta and Brito Cruz (2003) have discussed the growth in impact of Brazilian scientific
publications. According to these authors, the impact factor of Brazilian publications,
measured by the ratio of citations (three-year window) to publications, increased from 1.0
in 1981 to 1.9 in 1998.

Table V.

Brazilian publications in

Figure 6.

Citation impact of 
Brazilian publications in 
three fields published in 
four years: data represent 
the ratio of (total number 
of citations received by 
publications from year 
X/number of years after 
publication to 2004)/total 
number of publications 
from year X 


Table VI. 

Citation record of 
Brazilian papers in 
astronomy, immunology 
and oceanography 
published in the SCI in 
1988, 1992, 1996 and 2000 


For the present study, the impact of publications in the three fields was analysed
(Figure 6). Even if we neglect papers from 2000, it is clear that the normalised average
of citations per paper is increasing in all three fields. It increased from 1.2 to 1.9 in
astronomy, from 1.2 to 2.6 in immunology, and from 1.1 to 2.4 in oceanography.
Although impact data were normalised (Methodology, v.s.), only in astronomy was the
average for 2000 similar to that found for 1996. This suggests that publications from
this field are more rapidly cited than those from the other two fields. Despite this
difference, analysis of the uncited publications (Table VI) published in the years 1988,
1992, 1996 and 2000 shows that they represent a very similar fraction of all
publications from the three fields: 26 per cent in astronomy, 22 per cent in
immunology and 24 per cent in oceanography.
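
The normalisation used for Figure 6 (citations divided by the number of years between publication and 2004, then divided by the number of publications from that year) and the uncited fractions can be sketched as follows. This is a minimal illustration, not the authors' own procedure: the function names are assumptions, and the first example uses made-up citation counts, whereas the cited/not-cited counts are those reported in Table VI.

```python
def normalised_impact(total_citations, pub_year, n_publications, census_year=2004):
    """Citations per paper per year elapsed between publication and the census year."""
    years_elapsed = census_year - pub_year
    if years_elapsed <= 0 or n_publications <= 0:
        raise ValueError("need a positive time window and at least one publication")
    return (total_citations / years_elapsed) / n_publications


def uncited_share(not_cited, total):
    """Percentage of publications that received no citations."""
    return 100.0 * not_cited / total


# Hypothetical cohort (illustrative numbers only): 100 papers from 1996
# that had received 1,600 citations in total by 2004.
print(normalised_impact(1600, 1996, 100))

# (not cited, total) counts as reported in Table VI.
table_vi = {"astronomy": (93, 361), "immunology": (64, 290), "oceanography": (14, 59)}
for field, (not_cited, total) in table_vi.items():
    print(field, round(uncited_share(not_cited, total)))
```

Dividing by the years elapsed keeps older cohorts, which have had longer to accumulate citations, comparable with more recent ones.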

Details of the most highly cited publications from each of the three fields are
presented in Table VII. The cumulative number of citations of these publications
differs among the fields. In immunology, the range of citations of the top-cited
publications was higher than that found for the top-cited publications from astronomy,
which were in turn more highly cited than those from oceanography.

The type of publication also varies. In astronomy and oceanography, there are
“reviews” and “notes” or “letters” among the most cited publications, while in
immunology there are only “articles”. All the top-cited publications were co-authored
with an international partner, especially from the US. This is, however, not true for
national collaboration. Among the 15 publications listed in Table VII only three have

Note: Data represent the ratio of (total number of citations received
by publications from year X/number of years after publication to
2004)/total number of publications from year X

1988-2000        Cited   Not cited   Total   Not cited (%)

Astronomy         268       93        361         26
Immunology        226       64        290         22
Oceanography       45       14         59         24
Source: SCI











Year   Cites   Journal                             Type      Institutions          Countries
Astronomy
1988   168     Astrophysical Journal               Article   NO                    US, ZA
1996   162     Astrophysical Journal               Article   UFRGS                 US
2000   153     Astrophysical Journal               Article   UFRGS                 US, NL
1996   131     J. Astron’l Society of the Pacific  Review    UFSM                  FR, IT, US
1988   113     Astrophysical Journal               Letter    NO                    US
Immunology
1992   249     Journal of Experimental Medicine    Article   USP                   US
1996   245     Journal of Immunology               Article   UFMG                  DE, US
1996   192     Journal of Experimental Medicine    Article   FIOCRUZ - RJ, UFRJ    US
1996   127     Clinical Infectious Diseases        Article   UFCE                  US
1992   106     Transplantation                     Article   HOSP CLINICAS         UK
Oceanography
1988   2       Limnology and Oceanography          Article   CNEN, INPA            US
2000   44      Marine Geology                      Review    PETROBRAS - RJ        FR, UK
1996   42      Limnology and Oceanography          Article   INPA                  US
1988   36      Limnology and Oceanography          Article   CNEN, INPA            US
1992   30      Limnology and Oceanography          Note      USP                   US


Notes: Country codes: DE = Germany, FR = France, IT = Italy, NL = The Netherlands,
UK = United Kingdom, US = USA, ZA = South Africa. For Brazilian institutional acronyms, see
note to Table V
Source: SCI


more than one Brazilian institution involved. Curiously, although USP was responsible 
for the largest number of publications from the three fields, the university was 
responsible for only two of the top-cited publications. 

As for the subject matter of the top-cited publications, it seems that most are
oriented toward international interests. This is clear for all the publications in
astronomy and three of the five publications in immunology. In immunology, the most
cited publication is related to a tropical disease that is endemic in some Brazilian
regions, Chagas disease. In contrast, in oceanography the most cited publications are
oriented toward national interests: four are related to the Amazon River, the largest
river in the country, and one reviews concepts for the understanding of
deep water, the focus of PETROBRAS research.


Conclusion 
In the present study, data on human resources in S&T activities in Brazil as well as 
data on scientific outputs were used to map the state of the art of three fields: 
astronomy, immunology and oceanography. Within the international literature, there 
are only a few articles on trends in publications in astronomy, immunology and 
oceanography (Uzun and Ozel, 1996; dos Santos and Rumjanek, 2001; Dastidar, 2004). 
In view of the relatively recent institutionalisation of science in Brazil, an outline of 
trends in personnel and in scientific output in fields with different structural and 
historical aspects may help us to understand better the features of Brazilian science. It
is clear from the data presented here that there is a close correlation between scientific

Table VII.

Details of the top-cited
Brazilian publications
from three fields

outputs and the numbers of qualified researchers, although traditions also play a part.
However, it is important to point out some limitations of this study: the coverage of 
both the Brazilian and the ISI databases. The Brazilian scientific censuses lack some 
data for human resources and publications because the database covers only 80 to 90 
per cent of the whole Brazilian scientific community. As for the SCI, it is known that a 
large fraction of scientific research from developing regions, including Brazil, is 
published in domestic journals (Meneghini, 1992; Krzyzanowski and Ferreira, 1998). 
Thus, since the data shown here were obtained from a particular set of publications 
comprised in the world’s mainstream journals with the highest probability of having a 
Brazilian publication, they should be interpreted as publishing trends of the fields 
studied rather than as definitive.

Based on this consideration, we can suggest that the three fields differ a lot in terms
of the characteristics of their researchers (Tables I and II). The different distribution of
the total number of researchers registered in the CNPq database as well as the number 
of PhDs may be a consequence of the different processes of institutionalisation of the 
three fields. 

Trends in scientific output catalogued in the CNPq database indicate that 
astronomy publications differ from those in the other two fields in terms of pattern 
(Figure 1) and international visibility (Figure 2). Trends in scientific output catalogued
in the SCI corroborate this. Astronomy has the largest number of publications (Figure 3),
the highest share (Figure 4) and the highest impact (Figure 6), and tends to concentrate 
its publications in a small number of journals (Figure 5). Such differences, however, 
may be a consequence of intrinsic characteristics of each field. In immunology, it is 
known that the similarity and the overlap of the subject matter with other biomedical 
fields contribute to spread publications from immunology journals to others classified 
in other sub-fields. 

Analysis of the addresses of this set of publications confirms the trends observed in 
previous studies. Brazilian public universities are the institutions that contribute most 
to the scientific production of the country (Table V). A curious finding, however, is that
they were less prominent among the publications cited most (Table VII).


Notes

1. CAPES and CNPq are, respectively, the acronyms of Coordenação de Aperfeiçoamento de
Pessoal do Ensino Superior (Coordination for the Improvement of Higher Education
Personnel, available at: www.capes.gov.br/) and Conselho Nacional de Desenvolvimento
Científico e Tecnológico (National Council for Scientific and Technological Development,
available at: www.cnpq.br/english/aboutenpq/index.htm).


2. For details see: www.fiocruz.br/ingles/index.html


Abstract
Purpose — Seeks to characterise world astronomy research during the last decade by an analysis of 
papers in the Science Citation Index identified with a special filter and to study Indian output in order 
to identify the leading institutions and authors. 
Design/methodology/approach — Lists of specialist journals and title words of papers were 
selected to create a filter giving high precision and recall for astronomy papers. Some biology papers 
were erroneously retrieved because of ambiguous title words. Potential citation impact was 
determined from journal citation scores, and multiple regression was used to evaluate leading 
countries. 

Findings — Title words added almost a quarter to the list of papers in specialist journals, and the 
final file contained over 96,000 papers. Potential impact increased with more authors per paper and 
more addresses; it was greater for papers from Canada, the UK and the USA, and less for papers from 
China, India and Russia; for other countries the effects of the author's location on potential impact were 
not statistically significant. Indian astronomy output has increased in potential impact, partly through 
greater international co-authorship, but also through indigenous papers. 

Research limitations/implications — The study was confined to one subject area, and impact was 
determined on the basis of journals, not of individual papers. 

Practical implications — Use of title words in addition to journal lists is essential to sub-field 
definition in order to have high precision and recall. Because of the confounding effects of authorship 
numbers, it is necessary to use multiple regression analysis in order to see whether research from a 
given country is significantly better or worse than average. 
Originality/value — Characterises world astronomy research during the last decade by an analysis 
of papers in the Science Citation Index identified with a special filter. 


Keywords Astronomy, Information retrieval, Authorship, India 
Paper type Research paper 


Introduction 

Performance evaluation of scientific units (from university departments to nations) 
requires analysis of research outputs in a given subject. For research groups or 
institutions, lists of papers can often be obtained directly from them, and they allow a 








comparative analysis on the basis of multiple partial indicators (Martin and Irvine, 
1983). But for an evaluation of national performance, reliance is usually placed on an 
analysis of papers selected from large databases (Schubert et al., 1989; Braun et al.,
1994; May, 1997). The Science Citation Index (SCI) is often used for this purpose 
because it covers all scientific fields and includes all the authors’ addresses. 

Subject-based evaluation requires the extraction of an accurate subset of papers 
belonging to the particular subject from all papers listed in the database. The popularly 
accepted method of identifying papers in a particular subject is through journal 
classification. Journals are classified into different subject areas, and all papers in the 
journal are taken to belong to that subject area. However, the definition of subject 
boundaries using journal classification alone is unsatisfactory because there is often an 
overlap between lists of journals grouped by subject (except on the CHI Research Inc. 
system). Moreover, journal coverage ranges from very narrow for highly specialised 
journals to broad for non-specialist journals like Science or Nature, which are classified 
as multidisciplinary journals. When subject boundaries are defined using journal 
classification, papers in general or multi-disciplinary journals will be omitted, and in 
biomedicine these usually comprise the large majority of research outputs (Lewison, 
1996). Given that these journals have high levels of citation impact, it follows that 
using journal classification alone will miss legitimate and prestigious output appearing 
in multidisciplinary journals and those in neighbouring disciplines. 

Other possibilities include using departmental names in the author addresses to 
assign the papers to different subjects (de Bruin and Moed, 1993). This has been found 
to be unsatisfactory as the subject areas of papers often do not correlate well with the 
names of the departments to which the authors belong (Bourke and Butler, 1998). 
Another alternative is to classify individual papers on the basis of their references
(Glanzel et al., 1999). This works well when applied to a few multi-disciplinary journals
but would be too cumbersome to apply to the whole SCI in order to identify additional 
papers in a sub-field that were not in specialist journals. Moreover, it cannot cope with 
references to papers in other multi-disciplinary journals. It is, however, now being 
applied routinely by Thomson Scientific in order to include such papers in their 
“essential indicators”, although the procedure is not entirely justifiable. A specific 
counter example would be a paper that draws upon knowledge in a particular field to 
make a contribution in another. Judging by references alone may result in its being 
classified as being part of the first field whereas it should in fact be classified as being 
part of the second. 

Titles are often the best indicator of the content and subject of a paper. However, 
words can have different meanings in different contexts and this is a source of error in 
word-based information retrieval. Some input from a subject expert is invariably 
required to identify the subject area correctly from the title. In this paper we have used 
a combination of all three methods described above to design a title-word based filter 
for the subject astronomy and astrophysics (hereafter abbreviated as ASTRO). Using 
this filter for searching the SCI on CD-ROM, we have captured ASTRO papers that 
appeared in non-specialist or multidisciplinary journals in addition to those in 
astronomy journals. The filter is tuned with relevance input from the subject expert 
ensuring high precision and recall. The global output of ASTRO papers in the decade 
1994-2003 is evaluated in terms of growth and two journal impact factor-based 
measures, one based on the logarithms (LOG) of journal citation scores and one a
potential impact category (PIC). Next we have assessed the impact of collaboration
(multiple authors, addresses and nations) on the perceived quality of output as
measured by the potential impact category (PIC). Finally, we present a detailed case study of
ASTRO research in India.
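
The general idea of combining a journal list with title words can be sketched as below. This is a hedged illustration only, not the filter used in this paper: the two journal names are taken from elsewhere in this issue, while the title words and the function name `is_astro` are hypothetical choices made for the example.

```python
import re

# Illustrative lists only; the actual filter relies on expert-selected
# journals and title words that are not reproduced here.
SPECIALIST_JOURNALS = {"ASTROPHYSICAL JOURNAL", "ASTRONOMY & ASTROPHYSICS"}
TITLE_WORDS = {"galaxy", "galaxies", "quasar", "pulsar", "supernova"}  # hypothetical


def is_astro(journal, title):
    """Return True if a paper is in a specialist journal or its title has a listed word."""
    if journal.upper() in SPECIALIST_JOURNALS:
        return True
    words = set(re.findall(r"[a-z]+", title.lower()))
    return bool(words & TITLE_WORDS)


papers = [
    ("Astrophysical Journal", "Spectral analysis of a nearby source"),
    ("Nature", "A pulsar in a binary system"),
    ("Nature", "Protein folding in yeast"),
]
for journal, title in papers:
    print(journal, "|", title, "->", is_astro(journal, title))
```

A word-based rule of this kind will also retrieve papers in which a listed word is used in another sense, which is why relevance judgements from a subject expert are needed to tune the lists.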


Background 

Astronomy and astrophysics (ASTRO) has been subject to a number of tintiometrie 
evaluations, for which it is regarded as a suitable model because almost all the work is 
conducted in the public domain and without commercial constraints on publication. It 
is also inevitably international in its purview so English is the normal language for 
communication and the SCI gives good coverage of outputs. A comprehensive survey 
of all the ASTRO literature, including books and reports, from 1969 to 1987 was 
published by Davoust and Schmadel (1991). They described the main trends and 
showed that the subject had advanced in terms of paper production, especially in — 
France. More detailed investigations were carried out on particular bibliometric 
aspects by Abt (1981, 1984, 1988, 1990). Jaschez (1992) compared the outputs of five 
European countries (France, Germany, Spain, Sweden and Switzerland) in the 
mid-1980s, and showed that Switzerland, despite having much the smallest output,
received the most citations per paper. 

Perhaps the most comprehensive examination of ASTRO covered 15 OECD 
countries (van der Kruit, 1994) and compared their outputs with their expenditures on 
research. He showed that in the early 1980s, the countries varied substantially in their 
relative commitment (RC) to ASTRO research (expressed as the ratio of their 
percentage presence in the field to their percentage presence in all science). Italy (1.70), 
The Netherlands (1.40) and Germany (1.36) had the greatest RC values, and Japan (0.39) 
and the three Scandinavian countries: Sweden (0.38), Denmark (0.49) and Finland 
(0.52), the least. Van der Kruit also estimated total ASTRO research expenditure in the
15 countries as just over $2 billion p.a. in the early 1990s. A preliminary version of the 
evaluation of ASTRO research based on a title-word filter was presented by the 
authors at the 6th ISSI Conference in Sydney (Basu and Lewison, 2001). 
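
The relative commitment (RC) ratio described above can be sketched in a few lines. This is a minimal illustration only: the function name and the paper counts are hypothetical and are not taken from van der Kruit (1994) or from the SCI.

```python
def relative_commitment(field_country, field_world, all_country, all_world):
    """RC = country's share of the field's papers divided by
    the country's share of papers in all science."""
    field_share = field_country / field_world
    science_share = all_country / all_world
    return field_share / science_share


# Hypothetical counts, chosen only to illustrate the ratio.
rc = relative_commitment(field_country=850, field_world=10_000,
                         all_country=5_000, all_world=100_000)
print(round(rc, 2))  # 1.7
```

An RC above unity means the country is more visible in the field than in science as a whole; a value below unity means the opposite.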

Subsequently a number of studies of ASTRO research outputs in individual 
countries have appeared, covering Australia (Bourke and Butler, 1995), Brazil (Leta 
and Lewison, 2003), China (Liu and Shu, 1995), India (Barve and Gopal-Krishna, 2002), 
Turkey (Uzun and Ozel, 1996) and the US (Trimble, 2000). One common finding is that, 
although ASTRO research is now increasingly the pursuit of small teams (with a mean 
number of authors per paper of three or more (Fernandez, 1998)), it still has a fair 
number of single-author papers. However, most of these scientometric studies used a
rather simplistic delineation of the ASTRO field, based on a list of journals.


Methodology 

The filter that we developed was designed to capture ASTRO papers (including ones
on the solar system) and was calibrated through a partnership between a subject
expert (AB) and a bibliometrician (GL). First, the titles of articles in specialist (i.e.
ASTRO) journals were taken from the Science Citation Index (SCI annual CD-ROM)
and words arranged in descending order of frequency. Each title word from the list was 
checked in turn to see which extra papers it retrieved from the SCI. Some words had to 
be qualified by other “yes” or “no” words to filter out irrelevant titles. Papers were
downloaded and samples of them were marked as “relevant”, “marginal/don’t know” or
“irrelevant”. If a word generated many false positives the search expression would 
need to be suitably modified. Similarly, if there were false negatives, more words would 
need to be included. The words were then combined with Boolean expressions to act as 
the title-word filter. The final filter was a combination of as many as 50-60 words. The 
specialist journals were also checked to see if 90 per cent of their papers were 
acceptable without using the title-word filter. Papers from these journals could then be 
downloaded directly. 
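
The logic of a title-word filter with qualifying “yes” and “no” words can be sketched as follows. The word lists here are hypothetical and far shorter than the 50-60 words of the actual filter, which is not reproduced in the paper; the sketch only shows how Boolean qualification removes false positives of the kind discussed in the results.

```python
import re

# Hypothetical rules, NOT the authors' word lists. For each title word:
#   "yes": if non-empty, at least one of these words must also appear;
#   "no":  none of these words may appear.
RULES = {
    "galaxy": {"yes": set(), "no": set()},
    "planet": {"yes": set(), "no": {"centrifuge"}},
    "satellite": {"yes": {"orbit", "jupiter", "saturn"}, "no": {"dna"}},
    "venus": {"yes": set(), "no": {"flytrap"}},
}


def tokenize(title):
    """Lower-case the title and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", title.lower()))


def matches_filter(title, rules=RULES):
    """Return True if any title word passes its yes/no qualification."""
    words = tokenize(title)
    for key, rule in rules.items():
        if key not in words:
            continue
        if rule["no"] & words:
            continue  # excluded by a "no" word
        if rule["yes"] and not (rule["yes"] & words):
            continue  # required "yes" word is missing
        return True
    return False


print(matches_filter("Radar mapping of Venus"))       # True
print(matches_filter("Satellite DNA Polymorphisms"))  # False
```

In practice each rule would be revised after inspecting samples of retrieved titles, as the calibration procedure describes.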

To calibrate the filter, random samples were selected from the downloaded titles 
and checked to find those not belonging to ASTRO. The difference between unity and 
the proportion of false positives to sampled papers determined the precision of the 
filter. Similarly, random samples from ASTRO departments were selected from the SCI 
to check for relevant papers not identified by the filter (false negatives). The difference 
between unity and the proportion of false negatives to sampled papers determined the 
recall. The filter was iteratively modified to improve both precision and recall. The 
final value of the calibration factor, C = p/r, was 1.02, with precision p = 0.98 and
recall r = 0.96.
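
As a worked check of the calibration arithmetic, the sketch below computes precision, recall and the calibration factor from sample counts. The sample sizes and error counts are hypothetical, chosen only so that they reproduce the reported values of 0.98 and 0.96; the paper does not state the actual sample sizes.

```python
def precision(false_positives, sample_size):
    """Unity minus the proportion of false positives in the sample."""
    return 1 - false_positives / sample_size


def recall(false_negatives, sample_size):
    """Unity minus the proportion of false negatives in the sample."""
    return 1 - false_negatives / sample_size


def calibration_factor(p, r):
    """C = p / r, the ratio of precision to recall."""
    return p / r


# Hypothetical sample counts (assumptions for illustration).
p = precision(false_positives=10, sample_size=500)
r = recall(false_negatives=20, sample_size=500)
c = calibration_factor(p, r)
print(round(p, 2), round(r, 2), round(c, 2))  # 0.98 0.96 1.02
```

A factor of this kind is commonly used to scale the number of retrieved papers to an estimate of the true number of relevant papers, although the paper does not spell out that step.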

The papers were classified, on the basis of the journals in which they were 
published, by major field and sub-field (using the classification system of CHI Research 
Inc.), and also in terms of their potential citation impact. This was based on five-year 
citation scores ($C_{0-4}$ values, taken from a file provided by the Institute for Scientific
Information) and was represented by two numbers. One was called “LOG” and was
based on the logarithm of this score:


$$\mathrm{LOG} = 1 + 2\log_{10}(C_{0-4} + 1)$$


Thus a journal with very few citations would have a LOG value of unity, and a 
highly-cited journal such as Nature would score about 5. This scheme follows a survey 
of research administrators (Lewison, 1998) in which they were asked to rate papers in 
“excellent” and “good” journals, relative to ones in “ordinary” journals. The mean 
ratings were about 6 to 1 and 2.5 to 1, respectively. Similar results were obtained from a 
survey of Swedish biomedical researchers (Bienenstock et al., 1996, p. 59).

For some comparisons, it was convenient to use a categorical approach, and divide 
the journals into four groups, called “potential impact categories” or PICs, ranging 
from PIC1 (low potential impact) to PIC4 (high potential impact). Journals with $C_{0-4} < 6$
were assigned to PIC = 1; ones with $6 \le C_{0-4} < 11$, PIC = 2; ones with
$11 \le C_{0-4} < 20$, PIC = 3; and ones with $C_{0-4} \ge 20$, PIC = 4. The corresponding
critical values of LOG were 2.69, 3.16 and 3.64. 
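
The LOG score and the PIC thresholds can be checked with a short sketch. The function names are illustrative, and the code assumes that each category includes its lower bound (6, 11 and 20), which is the reading consistent with the critical LOG values quoted above.

```python
import math


def log_score(c04):
    """LOG = 1 + 2 * log10(C + 1), where C is the five-year citation score."""
    return 1 + 2 * math.log10(c04 + 1)


def pic(c04):
    """Potential impact category, assuming half-open intervals
    [0, 6), [6, 11), [11, 20) and [20, inf)."""
    if c04 < 6:
        return 1
    if c04 < 11:
        return 2
    if c04 < 20:
        return 3
    return 4


for threshold in (6, 11, 20):
    print(threshold, round(log_score(threshold), 2))
# 6 2.69
# 11 3.16
# 20 3.64
```

A journal with no citations scores exactly 1, and a score of 5 corresponds to a five-year citation score of 99.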


Results of bibliometric study 

The filter downloaded 96,584 papers from the SCI for the ten-year period from 1994 to 
2003. Of these, 408 (0.4 per cent) were not relevant, mainly because some astronomical
words also have other meanings, for example: 


* Petunia Asteroid Mosaic-Virus (Peamv);
* Peacock butterfly, Inachis Io Geisha;
* Monitored atherosclerosis regression study (Mars);
* Laurence-Moon-Bardet-Biedl Syndrome;
* Planet Centrifuge;
* Satellite DNA Polymorphisms;
* Titan Soybean; and
* Venus Flytrap, Dionaea-Muscipula.

These were manually removed leaving 95,186 papers in the file. Of these, 73,019
appeared in ASTRO journals. Papers in other journals included:

* 11,794 in physics journals;
* 7,250 in earth and space non-astronomy journals;
* 1,656 in “biomedical” journals (of which 1,035 were in Nature, Science and
  Proceedings of the National Academy of Sciences of the USA); and
* 1,942 in engineering and technology journals.

Just under 25 per cent of the papers were retrieved with title words; they were in nearly
500 different non-specialist journals. “Eponymous” departments (ASTRO, OBSERV,
PLANET, SOLAR, SPACE, TELESCOPE) published 85 per cent of the papers in
ASTRO journals but only 47 per cent of papers in “other” journals.

ASTRO represents about 1.7 per cent of world science. At the beginning of the
ten-year period, papers in ASTRO journals were growing at the rate of 3 to 4 per cent
per year. The growth had almost ceased by 2003. However, the contribution from
physics journals to ASTRO showed a strong overall annual growth of 5.7 per cent,
reaching close to 3,000 papers in the last two-year period, 2002-2003. Contributions
from other disciplines to ASTRO, being few (< 200-300 papers per year), were subject
to annual fluctuations, but generally showed a falling trend (Table I).

Table I. The breakdown of ASTRO papers in journals classed by discipline

Period                      1994-1995  1996-1997  1998-1999  2000-2001  2002-2003   Total      %
Earth and space                14,720     15,958     16,364     16,691     16,536  80,269   83.5
  Of which: ASTRO              13,265     14,434     14,988     15,166     15,166  73,019   75.9
  Of which: Other               1,455      1,524      1,376      1,525      1,370   7,250    7.5
Physics                         1,887      2,047      2,344      2,599      2,917  11,794   12.3
Engineering and technology        484        305        307        451        395   1,942    2.0
Biomed res.                       429        370        324        293        240   1,656    1.7
  Of which: Nature                111        111        109         87         95     513    0.5
  Of which: PNASUS                 14          3         29         19          8      73    0.1
  Of which: Science                87        131         90         74         67     449    0.5
  Of which: Other                 217        125         96        113         70     621    0.6
Chemistry                          49         64         58         58         52     281    0.3
Mathematics                        23         26         28         18         23     118    0.1
Other                              16         14         47         25         20     122    0.1
Total papers                   17,608     18,784     19,472     20,135     20,183  96,182  100.0

There were many papers with just one author (21 per cent) or two (27 per cent).
Papers in top-rated journals (PIC = 4, $C_{0-4} > 20$) were 0.2 per cent of those in
astronomy journals but 8.3 per cent of those in general journals. The mean LOG value
of journals in ASTRO rose from about 2.8 to 3.1 between 1994 and 2003. In the same
period the mean LOG of biomedical journals (including Nature and Science) grew
from 3.4 to 4.1. This implies that contributions to ASTRO appearing in these journals
contribute positively to the mean impact of the field. The impact of journals in
chemistry, physics, mathematics and engineering is lower than that of ASTRO journals
(mean LOG < 3), implying that publications appearing in these journals are likely to
contribute negatively to the impact of the subject. These are broad conclusions, and do
not apply to individual papers. For example, an ASTRO paper appearing in a
high-impact physics journal could increase the potential impact of the field.


Country performance in ASTRO 

The outputs of 16 countries accounted for 99 per cent of the ASTRO papers in
1994-2003 (see Table II). The leading countries are the US (46 per cent), UK (13.6 per cent)
and Germany (13.5 per cent). Together they contributed to 73 per cent of world ASTRO
publications. While the US share declined slightly during this period, both
Germany and the UK increased their shares to 14-15 per cent by 2002-2003.

The highest growth rates were achieved by China (12 per cent) and Spain (8.8 per
cent), followed by Switzerland, Sweden and Italy (>7 per cent each). Only Russia had
an overall negative growth during this period (Table II). Countries with major
commitments to astronomy are Italy (3.3 per cent of its science papers), Russia (3.1 per 
cent), The Netherlands and Spain (2.9 per cent). Several countries had values of relative 
commitment (RC) to ASTRO greater than unity indicating that they regarded the field 
as important. Italy had the highest RC (1.98) followed by Russia (1.87), Spain and The 
Netherlands (1.73 each). Countries with less ASTRO research effort were Japan (0.67) 
and Sweden (0.73). These conclusions broadly mirror those of van der Kruit (1994) cited 
above. Countries where the relative research effort in astronomy increased were Italy,
The Netherlands, UK and Switzerland. China decreased its relative contribution in
astronomy notably. Taking into consideration the high growth rate in ASTRO papers
in China, we conclude that other disciplines have grown still faster.

Table II. Main characteristics of country output in ASTRO, 1994-2003

ISO  Country name     % ASTRO  Growth, %    RC  RC change  Impact
     World             100.00        1.8  1.00               0.00
AU   Australia           3.69        4.5  1.38               0.11
CA   Canada              4.37        0.2  0.97               0.26
CH   Switzerland         2.07        7.7  1.07     ++        0.00
CN   China               2.81       12.0  0.94     --       -0.20
DE   Germany            13.48        5.0  1.53               0.02
ES   Spain               5.00        8.8  1.73               0.08
FR   France              9.63        4.3  1.46               0.00
IN   India               2.65        5.1  1.34              -0.16
IT   Italy               8.61        7.2  1.98     ++        0.06
JP   Japan               6.77        6.4  0.67               0.04
NL   Netherlands         4.51        4.9  1.73     ++        0.09
PL   Poland              1.99        6.3  1.64              -0.03
RU   Russia              6.93       -0.1  1.87     +        -0.64
SE   Sweden              1.55        7.4  0.73     +        -0.01
UK   United Kingdom     13.64        5.6  1.53     ++        0.11
US   United States      46.33        1.6  1.38     +         0.27

Notes: Per cent ASTRO on an integer count basis; RC = relative commitment to ASTRO compared
with all science; impact is mean LOG value for papers minus the world mean value (3.01)

Countries with ASTRO papers in high impact journals (high LOG values) were the
US and Canada (0.27 higher than world values). Countries with LOG impact lower than
world values were Russia, China and India.


Regression results for dependence of LOG values on variables

The significance of the LOG values was checked by subjecting all the papers to
multiple linear regression analysis, with independent variables being the year of
publication, the number of authors, addresses and countries, and the presence of each
of the individual countries in the address field (A = number of authors; D = number of
addresses; NC = number of countries). The regression tests for quadratic dependence
of LOG on the number of authors and addresses, as well as the effect of having a
particular country in the address list of a paper. The significance values improved
when quadratic terms in authors and addresses were used and NC was dropped.
Results for the two quinquennia 1994-1998 and 1999-2003 are shown in Table III.
The only countries that had a positive and significant relationship to LOG in both
quinquennia were the US, Canada and UK, while the three countries that had a
negative influence on LOG in both periods were Russia, China and India.

Table III. Regression results for ASTRO papers 1994-1998 and 1999-2003


                          1994-1998                1999-2003
Variables                 Coefficient  Sig. (%)    Coefficient  Sig. (%)
(Constant)                      2.182      0.00          2.397      0.00*
A (= No. of authors)            0.087      0.00          0.125      0.00*
AA (= A^2)                     -0.009      0.00         -0.012      0.00†
D (= No. of addresses)          0.178      0.00          0.120      0.00*
DD (= D^2)                     -0.015      0.00         -0.007      0.01†
Year                            0.068      0.00          0.014      0.00*
Australia                      -0.044      2.36          0.032      n.s.
Canada                          0.215      0.00          0.174      0.00*
China                          -0.146      0.00         -0.154      0.00†
Germany                        -0.002      n.s.          0.009      n.s.
France                         -0.023      n.s.         -0.049      0.01
India                          -0.120      0.00         -0.168      0.00†
Italy                           0.023      n.s.         -0.002      n.s.
Japan                           0.076      0.00          0.014      n.s.
The Netherlands                -0.081      0.00         -0.003      n.s.
Poland                          0.009      n.s.          0.057      2.04
Russia                         -0.548      0.00         -0.561      0.00†
Spain                           0.052      0.26          0.027      n.s.
Sweden                          0.009      n.s.         -0.035      n.s.
Switzerland                     0.034      n.s.         -0.061      1.18
United Kingdom                  0.070      0.00          0.056      0.00*
United States                   0.347      0.00          0.378      0.00*
Other                          -0.097      0.00         -0.027      0.50

Notes: *Significant positive contribution to LOG; †Significant negative contribution to LOG








Collaboration characteristics 

The quadratic relationship of LOG to number of authors (A) and addresses (D) 
indicated by the multiple linear regression is shown in Figure 1. It shows an increase in 
potential citations with larger collaborating groups of institutions. An increase in the 
number of authors per paper also increases potential citations. The effect saturates at 
five authors, falling thereafter with larger numbers of authors. This indicates that 
collaborative papers are more likely to appear in journals with higher impact. 
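
The saturation point follows from the quadratic form of the fitted model: a term b·A + c·A² with c < 0 peaks at A = −b/(2c). The sketch below applies this to the author coefficients (A and AA) reported in the regression table; the function name is illustrative, and the calculation treats all other variables as held constant.

```python
def peak(linear, quadratic):
    """Vertex of linear * x + quadratic * x**2, assuming quadratic < 0."""
    return -linear / (2 * quadratic)


# Author coefficients (A, AA) as reported for the two quinquennia.
print(round(peak(0.087, -0.009), 1))  # 1994-1998: 4.8
print(round(peak(0.125, -0.012), 1))  # 1999-2003: 5.2
```

Both periods give a maximum near five authors, in line with the saturation described above. The same calculation for addresses (D and DD) gives peaks at about 5.9 and 8.6.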

The distributions of papers in journals subdivided into the four impact categories 
(potential impact categories, from PIC1: low impact to PIC4: high impact) are shown in
Figures 2 and 3. Close to 45 per cent of single-authored papers appeared in low impact 
journals. The distribution shifts to higher impact categories as author numbers 
increase (Figure 2). A similar trend is seen for numbers of addresses, with fewer papers 
in low impact journals by scientists jointly working from different addresses (Figure 3).

Figure 1. Co-authorship trends in ASTRO papers in SCI, 1994-2003, show an increase in
potential citations ($\mathrm{LOG} = 1 + 2\log_{10}(C_{0-4} + 1)$, where $C_{0-4}$ is the journal
five-year citation score) with numbers of authors (A) and addresses (D) per paper

Figure 2. Distribution of ASTRO papers, 1994-2003, in different potential impact
categories (PIC1 = low; PIC4 = high) with one or more authors (A)

Figure 3. Distribution of ASTRO papers, 1994-2003, in different potential impact
categories (PIC1 = low; PIC4 = high) with one or more addresses (D)

Figure 4. Increase in levels of ASTRO research publications from India between 1994-2003,
domestic and internationally co-authored (three-year running means)


Case study of India

India produced a total of 2,553 ASTRO papers in the ten years 1994 to 2003. It
contributed an annual average of 2.7 per cent of world literature in the subject and
ranked 13th after the US, UK, Germany, France, Russia, Japan, Italy, Canada,
Australia, Spain, The Netherlands and China. During this period its ASTRO output
increased from about 200 to 300 papers per year, or from about 2.2 per cent of total
astronomy papers to 2.8 per cent (Figure 4). It has been overtaken by China, which
increased its percentage contribution in astronomy from 1.9 per cent to 3.6 per cent of
world papers. The overall Indian contribution to papers in all fields in SCI
increased from 1.9 per cent to 2.2 per cent in 1994-2003. This implies that the relative
Indian contribution to ASTRO research is higher than in other fields.

Both Indian-authored and internationally co-authored papers in ASTRO showed 
increasing trends. The proportion of internationally co-authored papers increased from 
31 per cent of total Indian ASTRO papers in 1994 to 42 per cent in 2003. India’s main 
collaborators were the US with 498 joint papers, and the UK, Germany, France and
Italy with more than 100 papers each. More than 200 papers were co-authored with
countries other than the 16 analysed here. 

The average potential impact (LOG) of Indian ASTRO papers, which was below the 
world average in the mid-1990s, caught up in 2002-2003 (Figure 5). Over the same 
period China, which doubled its ASTRO output and overtook India in production, had 
a slower growth in potential impact. 


Collaboration and impact 

The distribution of papers in journals of different potential impact categories (PIC)
showed a decrease in papers in low impact categories (PIC1) with the addition of
countries to the address list, at least for up to two countries (Figure 6, upper). The
proportion of high impact category (PIC4) papers increases with up to two foreign
countries but falls off with additional countries. Papers in the mid-to-high impact
range dominate here. Multilateral collaborations with five or more countries show
varying trends and should strictly not be used to draw conclusions as their numbers
are small (50 papers) and would be subject to year-to-year fluctuations. International
collaborative papers have higher potential impact both for papers in ASTRO and
“other” journals publishing astronomy papers (Figure 6, lower).

Figure 5. The average potential impact (LOG) of Indian and Chinese ASTRO papers
compared to world values, in five two-year periods

Figure 6. Distribution of Indian ASTRO papers by PIC of journal (1 = low, 4 = high) for
papers with different numbers of foreign collaborators (upper) and in specialist (A and A)
or other journals (lower)

Figure 7. Distribution of Indian ASTRO papers: domestic (left chart), international
(right chart) by journal potential impact categories

Domestic collaboration patterns also showed an improvement in the distribution in 
PIC categories for multiple authorship and addresses, but the effects were marginal. 
Overall, collaboration within India is considerably less than that for other countries, 
notably Japan, which has on average six authors per paper. This could be because 
ASTRO research activity is more dispersed in India. A comparison with Brazil and 
China shows the following: 


* Brazil: 556 addresses, 105 institutions, five cover 50 per cent.
* China: 1,295 addresses, 168 institutions, eight cover 50 per cent.
* India: 771 addresses, 144 institutions, nine cover 50 per cent.


The comparison supports this hypothesis: Indian ASTRO output is, relative to its magnitude,
more dispersed than it is in Brazil and China.
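
The “institutions covering 50 per cent” measure used in this comparison can be sketched as below. The per-institution counts are hypothetical, since the paper reports only the summary figures. The final normalisation per 100 addresses is one possible reading of “relative to its magnitude” and is an assumption, not the authors' stated method.

```python
def institutions_covering(counts, share=0.5):
    """Smallest number of institutions, taken in descending order of
    output, whose combined count reaches the given share of the total."""
    total = sum(counts)
    running = 0
    for n, count in enumerate(sorted(counts, reverse=True), start=1):
        running += count
        if running >= share * total:
            return n
    return len(counts)


# Hypothetical address counts per institution (illustration only).
concentrated = [40, 30, 10, 5, 5, 5, 5]
dispersed = [15, 15, 10, 10, 10, 10, 10, 10, 10]
print(institutions_covering(concentrated))  # 2
print(institutions_covering(dispersed))     # 4

# Reported summary figures: (addresses, institutions covering 50 per cent).
reported = {"Brazil": (556, 5), "China": (1295, 8), "India": (771, 9)}
# Assumed normalisation: covering institutions per 100 addresses.
per_100 = {k: round(100 * n50 / addr, 2) for k, (addr, n50) in reported.items()}
print(per_100)  # {'Brazil': 0.9, 'China': 0.62, 'India': 1.17}
```

On this normalisation India has the highest value of the three, which matches the conclusion that its output is the most dispersed.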

To understand the reason for the growing impact of Indian papers we look at the 
annual distribution of Indian ASTRO papers in the four PIC categories for domestic 
and internationally co-authored papers (Figure 7). In general, there is a higher
proportion of papers in PIC3 and PIC4 journals for internationally co-authored papers. 
In the domestic papers there has been a decline in the proportion of papers with lowest 
impact (PIC1 journals). There has also been an increase in PIC4 papers after 1998 for 
both domestic and international papers. Prior to 1998, the proportion of PIC4 papers 
was negligible. Since 1998-1999 the proportion of papers in PIC4 journals is 10 per cent 
for domestic and 20-25 per cent for international papers. This could indicate a thematic 
change in research content. The growth curve of papers also shows a spurt of growth 
at the same time (see Figure 4). 

Thematic content was analysed for a random sample of 600 titles for the first and 
second half of the ten-year period. The main themes are shown in Table IV. It shows a 
change in thematic content towards cosmology and black holes. This could account for 
the increase in higher PIC categories after 1998. 

The leading Indian institutions in terms of overall ASTRO research contribution are 
listed in Table V. Together they contributed to 76 per cent of Indian ASTRO papers.
The leading Indian scientists contributing to ASTRO with more than 25 papers in the
ten years are listed in Table VI. The mean LOG values lie between 2 and 3, except for
S. Chakraborty, most of whose publications appeared in mathematics journals, which
have lower overall potential impact.


Discussion and conclusions
In astronomy, we can define the sub-field quite well with title words although some are 
ambiguous. We get an additional 25 per cent of papers from general journals by using the
filter. There are fewer papers than for biomedicine in general journals and they are of 
lower potential impact, except for the ones in Nature and Science. Internationally and 
nationally collaborative papers are in higher-impact journals, with a large number of 
single-author papers appearing in low impact journals. Collaborative papers with 
multiple addresses are also in higher impact journals. This is probably due to more 
explicit funding via peer-review committees and bigger teams than to multi-lab work 
per se. However, it contrasts with the situation in biomedicine where multi-institution 
papers are in lower impact journals when allowance is made for their having more 
authors (Lewison and Dawson, 1998). 

Table IV. Leading themes of Indian ASTRO papers, 1994-1998 and 1999-2003 (> ten papers
in five years)

1994-1998                              1999-2003
Rank  Theme                   Papers   Rank  Theme                   Papers
1     Solar                       27   1     Cosmology                   51
2     Cosmology and models        21   2     Solar                       41
3     Galaxies/galactic disk      18   3     Black holes                 36
4     Pulsars                     17   4     Galaxies/galactic disk      26
5     Black holes                 13   5     Gravity                     17
                                       6     Pulsars                     13

Table V. Leading Indian institutions contributing to ASTRO research (1994-2003)

Rank  Institution name                                      Papers
1     Tata Inst. Fundamental Res. (TIFR), Bombay               469
2     Indian Inst. Astrophys, Bangalore                        462
3     Interuniv Ctr. Astron and Astrophys (IUCAA), Pune        324
4     Raman Res. Inst., Bangalore                              181
5     Phys. Res. Lab., Ahmedabad                               140
6     Jadavpur Univ., Calcutta                                 100
7     Univ. Delhi, Delhi                                        99
8     Indian Inst. Sci., Bangalore                              82
9     Banaras Hindu Univ., Varanasi                             70
10    Indian Inst. Technol., Delhi                              65
11    Natl. Ctr. Radio Astrophys (NCRA), Pune                   63
12    Uttar Pradesh State Observ., Nainital                     57
13    Saha Inst. Nucl. Phys., Calcutta                          53
14    Inst. Phys., Bhuvaneshwar                                 34
15    Sn. Bose Natl. Ctr. Basic Sci., Calcutta                  34

Table VI. Leading Indian ASTRO researchers, 1994-2003

Name                   n   LOG   Address
Sagar, R.             48  2.48   Uttar Pradesh State Observ., Naini Tal
Chakrabarti, S.K.     45  2.91   Sn Bose Natl. Ctr. Basic Sci., Jd Block, Salt Lake 700098, Kolkata
Chakraborty, S.       45  1.09   Jadavpur Univ., Dept. Math, Calcutta 700032, W. Bengal
Antia, H.M.           43  2.79   Tata Inst. Fundamental Res., Homi Bhabha Rd, Bombay 400005, Maharashtra
Rao, A.R.             43  2.49   Tata Inst. Fundamental Res., Homi Bhabha Rd, Bombay 400005, Maharashtra
Parthasarathy, M.     38  2.35   Indian Inst. Astrophys, Bangalore 560034, Karnataka
Padmanabhan, T.       36  2.86   Interuniv Ctr. Astron and Astrophys, Pune 411007, Maharashtra
Saikia, D.J.          35  2.57   Univ. Poona, Tata Inst. Fundamental Res., Natl. Ctr. Radio Astrophys., Pune 411007, Maharashtra
Seetha, S.            35  2.62   ISRO, Satellite Ctr., Bangalore 560017, Karnataka
Singh, K.P.           34  2.76   Tata Inst. Fundamental Res., Dept. Astron and Astrophys, Homi Bhabha Rd, Mumbai 400005
Narlikar, J.V.        33  2.06   Interuniv Ctr. Astron and Astrophys., Post Bag 4, Pune 411007, Maharashtra
Srianand, R.          31  3.00   IUCAA, Post Bag 4, Pune 411007, Maharashtra
Deshpande, A.A.       28  2.46   Raman Res. Inst., Cv Raman Ave, Bangalore 560080, Karnataka
Paul, B.              28  2.27   Tata Inst. Fundamental Res., Homi Bhabha Rd, Bombay 400005, Maharashtra
Anantharamaiah, K.R.  27  2.93   Raman Res. Inst., Bangalore 560080, Karnataka


The case study of India showed two phases of growth in ASTRO papers and an 
increased share of internationally co-authored papers with higher potential impact. 
The average potential impact, which was below world average in the 1990s, equalled it 
in 2002-2003. Together with increased international collaboration, which produces
papers in higher impact journals, this could also be due to a change in thematic content
because there were more papers in cosmology and on black holes. A comparison with
China showed that while China overtook India in terms of ASTRO publications, their 
impact was lower.


Abstract

Purpose — To conduct an analysis of the bibliometric presence of a patient group, the Multiple 
Sclerosis Society, within its relevant biomedical sub-field. 

Design/methodology/approach — Publications in the multiple sclerosis sub-field for 1988-1999 in 
the Research Outputs Database constitute the data-set. Proxy measures, based on funding 
acknowledgement counts, are used to analyse the bibliometric presence of the society in comparison 
with other leading agencies, focusing on visibility, research orientation and research impact. The 
results are discussed within the frame of an evolutionary economics of knowledge production and the 
larger policy debate concerning the public funding of science. 

Findings — The society is the most frequently acknowledged funding agency and it distinguishes 
itself by the clustering of its acknowledgements in the area of clinical investigation. With a high and 
leading research impact, the society is considered an influential actor in the relevant biomedical 
sub-field. 

Originality/value — This paper fills a gap in the literature on the public funding of science by 
drawing attention to the important performance and presence of patient groups as funding agencies. 
Keywords Financing, United Kingdom, Quantitative methods, Aid agencies, Disabled people, 
Publications 

Paper type Research paper 


Introduction 

Publicly funded and produced knowledge[1] plays an important role in stimulating and
directing downstream research developments (Salter and Martin, 2001)[2]. It is in this
context that the role of funding agencies occupies pivotal importance. Braun (1998)
draws attention to the role of funding agencies in structuring the cognitive
development of science. Within the family of funding agencies, patient groups present
an interesting, though neglected[3], phenomenon. The groups, through their network of
patients, carers and family members, possess a unique resource of experiential
knowledge and lay expertise[4]. These resources, if appropriately mobilised and
incorporated, have the potential of prioritising items in the agenda for and practice of 
biomedical and clinical research. Pertinent in this regard is evidence of mismatch at the
macro level in terms of the distribution of public monies and national disease burden
(Gross et al., 1999). There may also be a mismatch at the micro level within a medical
condition reflected in the discrepancies between treatments investigated by
researchers and those prioritised by consumers (Tallon et al., 2000).

Research for this paper was completed at the School of Public Policy, University College London
whilst the author was working on a project on patient groups and their engagement with drug
innovation. The discussions and comments of colleagues (Mark Duckenfield, Helen Margetts and

Using the Multiple Sclerosis Society (MSS) as a case study, this paper studies the
bibliometric presence of a patient group as a funding agency. The analysis is
comparative in that the performance of the MSS is compared with other leading
funding agencies, viz. the Medical Research Council and the Wellcome Trust. The
paper begins by drawing out some evidence of the emergence of patient groups. This is
followed by the bibliometric analysis and then by a discussion of the evidence.


The emergence of patient groups

With 54 per cent of British groups and 62 per cent of American groups being
established after 1980 (Table I), Wood (2000, p. 39) christens the year as marking the
genesis of a “new patient movement”. This increase has been accompanied by an
expansion in a range of organisational resources. Data reported in Wood (2000) show
that in 1996-1997, 20 per cent of American groups and 23 per cent of British groups
reported expenditures in excess of £1 million, with a small percentage having
expenditures in excess of £5 million. In the UK, not only do the financial resources of
some groups exceed those of major political parties and other economic interest groups
(e.g. the Trades Union Congress and Confederation of British Industry), but the groups
have become effective in political lobbying (Duckenfield, 2004).

Table I. Number of patient groups, 1997


              Number of groups
Year             USA      UK
Pre-1939           8      10
1940-1959         19      14
1960-1969         13      12
1970-1979         57      54
1980-1989        122      82
Since 1990        37      25
Total            256     197

Source: Wood (2000)
Note: Number of groups by year of establishment

Patients, often mobilised through an organisation, have become increasingly active
in articulating their needs and engaging with the health-care system. Accounts of
patient-carer-based movements in the US provide evidence of the reflexive capabilities
of lay-people in engaging with and shaping the scientific debates (Anglin, 1997;
Epstein, 1996; Klawiter, 1999). Similar currents are evident in the UK, where patient
groups have, in some instances, transformed the National Institute of Clinical
Excellence’s cost-effectiveness review of new health interventions. In this respect, the
Alzheimer’s Society sets a high watermark with its success in drawing attention to the
absence of patient experiences as outcomes within clinical trials (Alzheimer’s Society,
2000). Research in France on the muscular dystrophies association (Association
Française contre les Myopathies) demonstrates how members negotiate the orientation
and practice of research and associated clinical trials while engaging with specialists
(Rabeharisoa and Callon, 1998). Within the public sector in the UK there is evidence of
patient involvement at different research stages, including active participation in
reviewing randomised clinical trial protocols and their biomedical outcomes (Chalmers,
1995; Entwistle et al., 1998; Hanley et al., 2001). In this respect, the on-going clinical
trials of cannabinoids for multiple sclerosis are a significant example of how a patient
group provides a broad forum for “experts in experience” to engage with “experts in
science” (Rangnekar, 2004b). These trends corroborate earlier observations of AIDS
“disease victims” who, despite substantial resistance, succeeded in presenting
“themselves as credible within the arena of credentialed expertise” (Epstein, 1995,
p. 409, emphasis in original).

In the UK, this emergence of patient groups is also evident in the changing sources of 
public funding for biomedical research. A dominant element in this transformation has 
been the relative — and in some instances real — decrease in government contributions 
and the growth of the private-non-profit sector and the pharmaceutical industry (Dawson 
et al, 1998; see Figure 1)[5]. Most notable is the increase in real funding by the 
Association of Medical Research Charities (AMRC) from £165 million to £643 million, 
resulting in its share rising to 32 per cent. Patient groups fall within the category of 
charities. However, too much should not be read into the leading position of AMRC as its 
position is dominated by six groups: the Wellcome Trust, the Imperial Cancer Research 
Fund[6], the Cancer Research Campaign, the British Heart Foundation, the Arthritis 
Research Campaign and the Leukaemia Research Fund (Garnham, 2000)[7]. 

Using the MSS as a case study, the paper proceeds to analyse a patient group’s 
funding behaviour to explore factors underlying the emergence of patient groups. 
Between 1986 and 2000, the Society awarded 435 grants that amounted in total to more 
than £51 million[8]. The value of an individual grant and the frequency of grant-giving 
fluctuated (Table II). The number of grants awarded increased from about 25 per year
in the early 1980s to about 40 per year in the late 1980s, peaking at 47 grants in 1992. In 


Figure 1. UK public domain biomedical research funding (numbers of acknowledgements)

Note: AMRC = Association of Medical Research Charities; NHS = National Health Service;
MRC = Medical Research Council; HEFCs = Higher Education Funding Councils; Pharma
ind’y = pharmaceutical industry
Table II. MSS grants (1986-2000): basic indicators (1999 prices)





that year the total amount allocated exceeded £6 million. After 1992, the Society’s
grant-giving has fluctuated with a downward trend. Thus, in 2000 there were only 18 
grants totalling about £1.6 million. Yet, the largest grant, £2.5 million, was awarded in 
1998. 

Variations in the society’s grant-giving behaviour reveal some volatility (Figure 2). 
In 1986-1988, very large grants (> £500,000) accounted for 24 per cent of the amount 
disbursed, whereas in 1998/2000 they accounted for 41 per cent. During the same 
period, their share in the number of grants awarded nearly doubled. Another notable 
trend is the continuing, though diminished, preponderance of small grants (< £76,000). 
In 1986/1988, almost 55 per cent of the grants awarded were of this size, though
accounting for only 15 per cent of the amount disbursed. In 1991/1992, more than half 
the grants awarded were in this class; however, they accounted for only 13 per cent of 
the amount awarded. By 1998/2000, fewer than 42 per cent of the grants were in this 
class and their share of total disbursements fell to under 6 per cent. This grant 
allocation behaviour might enable the Society to mobilise a larger section of the 
scientific community. 


                              1986-1988   1989-1991   1992-1994   1995-1997   1998-2000

Number of grants                    107         110          98          58          62
Value of grants (£ million)       £9.97      £12.15      £13.70       £6.16       £9.14
Summary statistics
Average grant                   £93,141    £110,408    £139,839    £106,286    £147,367
Maximum grant (£ million)         £1.27       £1.97       £2.26       £0.71       £2.57
Minimum grant                    £2,028      £2,804      £3,588      £4,467      £3,100

Source: Author’s calculations from Patient Group Biomedical Funding Database; R&D expenditure
deflator from DTI/OST (2001)
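The indicators in the table can be cross-checked against one another: multiplying the number of grants in a period by the average grant should approximately reproduce the value of grants for that period, and the period totals should agree with the 435 grants and more than £51 million quoted in the text. The following is a minimal Python sketch of that check, using figures transcribed from the table; the variable names are illustrative and are not taken from the Patient Group Biomedical Funding Database.

```python
# Figures transcribed from the table (1999 prices); each period spans three years.
periods = ["1986-1988", "1989-1991", "1992-1994", "1995-1997", "1998-2000"]
n_grants = [107, 110, 98, 58, 62]
avg_grant = [93_141, 110_408, 139_839, 106_286, 147_367]  # pounds

# Implied value of grants per period, in millions of pounds.
implied_value = [n * avg / 1e6 for n, avg in zip(n_grants, avg_grant)]

for period, value in zip(periods, implied_value):
    print(f"{period}: £{value:.2f} million")

total_grants = sum(n_grants)
total_value = sum(implied_value)
print(f"Total: {total_grants} grants, £{total_value:.2f} million")
```

Because the averages in the table are rounded to the nearest pound, the implied values can differ from the reported ones in the second decimal place.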


Figure 2. Distribution of the MSS grants, by size, for the periods 1986-88 to 1998-2000

Note: Left hand columns: distribution by number; right hand columns: distribution by value








The bibliometric presence of the Multiple Sclerosis Society 

An examination of “influence” requires an analysis of the quality and impact of 
publications resulting from the funding agency’s grant allocations. Some of these 
impacts, such as improvement in medical training or changes in clinical practice, are 
difficult to identify and assess systematically. Within the literature the alternatives 
include using surrogate measures of research quality such as: 


* “journal impact factor” and “research level” (Garfield, 1972; Davis and Royle,
1996; Dawson et al., 1998);

* article-to-article citation rates (Hicks et al., 2002);

* article citations in patents (McMillan et al., 2000; Narin et al., 1997) and clinical
guidelines (Grant et al., 2000);

* case histories of health technologies (Cockburn and Henderson, 1998); and

* subjective views of a peer research group (Lewison, 2002) or industrialists
(Mansfield, 1991, 1995).


The popularity of bibliometric techniques stems from easy data availability and the
myriad possibilities of mapping linkages between authors, institutions, geographic 
location, and funding sources (Glänzel and Moed, 2002). This reasonably well-accepted 
practice has been adopted by funding agencies (e.g. Dawson et al, 1998; National 
Science Foundation, 2000), national governments (e.g. Bureau of Industry Economics, 
1996) and multilateral bodies (European Commission, 2003). Basic bibliometric 
indicators can be used to conduct a comparative analysis of the bibliometric presence 
of the MSS. 

The bibliometric research presented here uses the Research Outputs Database that 
was originally developed at the Wellcome Trust and was until recently maintained by 
the Bibliometrics Research Group at City University. Based on the Institute of 
Scientific Information’s Science Citation Index, the database is limited to UK-address 
publications and information for each item is supplemented with the corrected UK 
postcodes of authors and funding acknowledgements (FAs). In addition, specific 
search strings and keywords have been developed to demarcate biomedical sub-fields 
for different disease/research areas. These modifications allow for a comparative 
analysis of funding agencies within a sub-field. The following surrogate measures for 
research orientation and research quality are used: 


* Research type (RT)[9]: journals are classified on the basis of the dominant
characteristics of the papers published and the article-to-article citation patterns.
Following a methodology developed by the US-based CHI Research Inc., four
research types have been identified: basic research (RT4), clinical investigation
(RT3), clinical mix (RT2) and clinical observation (RT1). The use of this indicator
as a surrogate indicator of research orientation has become an industry standard
(Dawson et al., 1998, p. 54).

* Journal impact levels (JIL)[10]: each journal is assigned a score based on the
citation rates (within the relevant subject area) of the peer-reviewed publications
in the journal. The score is indicative of the average citations per paper in each
journal. A score of 4 is assigned to journals that are ranked within the top 10 per
cent (JIL4); a score of 3 is awarded to journals in the next 20 per cent (JIL3); 2 for
journals in the next 30 per cent (JIL2); and 1 for journals ranked within the
bottom 40 per cent (JIL1).

Table III. Top funding agencies in MS sub-field, 1988-1999, by numbers of funding acknowledgements (FAs)
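The banding of journals into impact levels, and the unadjusted mean JIL of a funding agency, can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function names, the journal names and the citation rates are hypothetical, and the way the Research Outputs Database handles ties and band boundaries is not described in the text, so the treatment here (ranks on a boundary fall in the higher band, ties broken by sort order) is an assumption.

```python
def jil_scores(citation_rates):
    """Band journals by rank: top 10% -> 4, next 20% -> 3, next 30% -> 2, bottom 40% -> 1."""
    # Rank journals from most to least cited within the subject area.
    ranked = sorted(citation_rates, key=citation_rates.get, reverse=True)
    n = len(ranked)
    scores = {}
    for position, journal in enumerate(ranked):
        rank = position + 1
        # Integer comparisons avoid floating-point issues at the band boundaries.
        if 10 * rank <= 1 * n:
            scores[journal] = 4
        elif 10 * rank <= 3 * n:
            scores[journal] = 3
        elif 10 * rank <= 6 * n:
            scores[journal] = 2
        else:
            scores[journal] = 1
    return scores


def mean_jil(fa_counts, scores):
    """Unadjusted mean JIL: average impact level over an agency's funding acknowledgements."""
    total = sum(fa_counts.values())
    return sum(scores[journal] * count for journal, count in fa_counts.items()) / total


# Hypothetical subject area of ten journals with made-up citation rates.
made_up_rates = [9.1, 7.4, 6.0, 4.2, 3.9, 3.1, 2.5, 1.8, 1.2, 0.6]
rates = {f"journal_{i}": rate for i, rate in enumerate(made_up_rates)}
scores = jil_scores(rates)

# Hypothetical agency acknowledged in papers from three of those journals.
agency_fas = {"journal_0": 2, "journal_2": 5, "journal_7": 3}
agency_mean = mean_jil(agency_fas, scores)
print(scores)
print(round(agency_mean, 2))
```

The unadjusted mean computed this way depends only on how an agency's acknowledgements are spread across the four levels, not on how many acknowledgements it has.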


The multiple sclerosis (MS) sub-field was defined by means of a filter developed by Dr 
Lorna Layward, formerly research manager with the MSS. It had a precision of 0.95 
and a recall of 0.93. For the 12-year period 1988-1999 the sub-field consisted of 1,332
UK papers and 2,616 FAs (Table III). The sub-field has increased in size, both in terms
of the number of papers, which have nearly doubled, and the FAs, which have nearly 
tripled. While the data-set contains 450 funding agencies, most have a nominal 
presence as only ten agencies independently accrue at least 26 FAs (viz. 1 per cent of 
global FAs). These ten agencies collectively account for 53 per cent of all FAs, their 
share having decreased from 61 per cent (1988/1990) to 47 per cent (1997/1999). The 
decrease in later years may be related to trends such as the widening of the sources of 
funding for public-domain research. Indicative of this trend is the increase in the 
number of FAs per paper: from 1.6 in 1988 to 2.2 in 1999. 
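For readers unfamiliar with the two filter statistics quoted above, precision is the share of retrieved papers that are genuinely relevant and recall is the share of relevant papers that the filter retrieves. The Python sketch below illustrates the definitions only: the paper identifiers are invented, and the text does not say how the precision and recall of the MS filter were actually estimated.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved items against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved)  # how much of what was retrieved is relevant
    recall = len(hits) / len(relevant)      # how much of what is relevant was retrieved
    return precision, recall


# Hypothetical identifiers: 100 papers judged relevant to the sub-field.
relevant = set(range(100))
# Hypothetical filter output: 93 of the relevant papers plus 5 irrelevant ones.
retrieved = set(range(7, 100)) | {200, 201, 202, 203, 204}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```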

The research reveals that even while the MSS’s relative share of FAs has fallen from 
27 per cent (1988/1990) to 14 per cent (1997/1999)[11], it remains the most frequently 
acknowledged source of funding, accounting for 18 per cent of global FAs. The Medical
Research Council (MRC) with 13 per cent and the Wellcome Trust (WT) with 11 per 
cent are the other leading agencies. However, even in 1997/1999 the MSS remains the 
most-frequently acknowledged source of funding. 
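The shares quoted in the two paragraphs above can be recomputed from the counts in the table of top funding agencies. The following Python sketch does so with the counts transcribed by hand; the variable and function names are illustrative. Note that it works with three-year periods, so its FAs-per-paper figures are period averages and need not equal the single-year figures for 1988 and 1999 given in the text.

```python
# Counts transcribed from the table of top funding agencies (UK MS papers, 1988-1999).
periods = ["1988-1990", "1991-1993", "1994-1996", "1997-1999"]
papers = [241, 303, 334, 454]
fas = [391, 579, 729, 917]
mss = [104, 123, 121, 129]
mrc = [67, 73, 85, 124]
wt = [33, 62, 100, 105]

total_papers = sum(papers)
total_fas = sum(fas)


def share(part, whole):
    """Percentage share, rounded to the nearest whole per cent."""
    return round(100 * part / whole)


for period, p, f, m in zip(periods, papers, fas, mss):
    print(f"{period}: {f / p:.2f} FAs per paper, MSS share {share(m, f)} per cent")

global_shares = {
    "MSS": share(sum(mss), total_fas),
    "MRC": share(sum(mrc), total_fas),
    "WT": share(sum(wt), total_fas),
}
print("Global shares (per cent):", global_shares)

# One per cent of all FAs, the cut-off used to single out the ten leading agencies.
threshold = total_fas // 100
print("1 per cent threshold:", threshold, "FAs")
```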

Consequently, there is little doubt that the MSS has high visibility. Recalling the 
analysis on the temporal distribution of grants (cf. Figure 2), we see that the high 
visibility of the MSS could be because it has provided a regular flow of financial 
resources and maintained a research portfolio with both many small grants and a 
growing number of large ones. 

Figure 3 shows the distribution of FAs across RTs for leading funding agencies. It 
is clear that there are many more FAs in basic research (41 per cent) than papers (33 per 
cent). In contrast, the respective percentages for clinical observation are 5 per cent and 
9 per cent. This suggests that funding agencies disproportionately favour basic 
research. Notably, the MRC and the WT exhibit a strong presence in basic research 
(RT4), which accounts for 53 per cent and 60 per cent of their FAs respectively[12]. In
contrast, the MSS favours either clinical mix or clinical observation, which accounts for 
39 per cent of its FAs. Moreover, the Society holds the leading share of all the clinical 
observation FAs (25 per cent). It is the distribution of FAs, in particular the society’s 


Years         Papers     FAs    MSS    MRC     WT            IoN    CEC

1988-1990        241     391    104     67     33    [illegible]      0
1991-1993        303     579    123     73     62             16      4
1994-1996        334     729    121     85    100             39     14
1997-1999        454     917    129    124    105              2     33
Total          1,332   2,616    477    349    300    [illegible]     51
Per cent                         18     13     11              2      2

Notes: MSS = Multiple Sclerosis Society, MRC = Medical Research Council, WT = The Wellcome
Trust, IoN = Institute of Neurology, CEC = European Commission

Source: Author's calculation from Research Outputs Database
Figure 3. Distribution of UK papers in multiple sclerosis research by research type,
1988-1999, for different funding acknowledgements

Note: MSS = Multiple Sclerosis Society; MRC = Medical Research Council; WT = the
Wellcome Trust; IoN = Institute of Neurology; CEC = European Commission





leading position in clinical observation and clinical mix, that differentiates it from other 
funding agencies. 

JIL values can be used to compare research impact. Some of the criticisms 
concerning the use of JIL as a surrogate measure of impact can be accommodated with 
an adjustment of JIL scores. The weighted mean JIL score makes allowances for 
differences in research orientation and for differences in visibility (see Appendix). This 
approach is reminiscent of Pinski and Narin’s (1976) approach of “influence weights”. 
In a similar fashion, the weighted JIL score presumes that, ceteris paribus, a funding 
agency with higher visibility, i.e. a greater share of FAs, would exercise greater
influence. 

Even while the MSS and the MRC account for comparable global shares of FAs in 
JIL4 journals (16 per cent), it is the differences in the internal distribution of FAs across
journals that explain their mean JIL score. Thus, 40 per cent of the MRC’s FAs are in
JIL4 journals compared with 29 per cent for the MSS. However, a different picture
emerges if we consider the mean-weighted JIL scores (Table IV, Figure 4). The MSS 
emerges as the leading funding agency in terms of research impact. This suggests that 
its high visibility compensates for the inherent bias of a low JIL on account of its 
research orientation: clinical investigation and clinical mix journals tend not to be as 
highly cited as basic research journals. In effect, the high visibility of MSS derives from 








Table IV. Journal impact level scores

Funding agency                  Mean (unadjusted)    Weighted mean

Multiple Sclerosis Society            2.75               6.96
Medical Research Council              3.10               5.32
Wellcome Trust                        2.67               3.63
Institute of Neurology                2.91               0.68
European Commission                   2.89               1.72
Figure 4. Weighted mean journal impact level in UK multiple sclerosis research,
1988-1999, for papers acknowledging different funding agencies

Note: MSS = Multiple Sclerosis Society; MRC = Medical Research Council; WT = the
Wellcome Trust; IoN = Institute of Neurology; CEC = European Commission



its leading position in RT2 and RT3 papers. Similarly, the MRC and the WT occupy 
leading positions in RT4[13] papers. 

It is clear from the bibliometric analysis that the MSS is a funding agency with high 
visibility in its biomedical sub-field since it is the most frequently acknowledged 
funding agency. In addition, because it has a leading share of FAs in RT3 and RT2 
papers, its research orientation differs from that of other leading agencies. Finally, even 
in terms of research impact the society occupies a leading position. 


Discussion 

The landscape of biomedical research funding in the UK has changed remarkably in 
recent years. Nearly a third of the acknowledgements to funding of research published 
in the public domain are accounted for by the Association of Medical Research 
Charities (cf. Figure 1). This is unparalleled elsewhere in the world. Reviewing this 
performance, Wood (2000, p. 180) concludes that: 


The contemporary state continues to have some record of quite unashamedly using 
associations [viz. patient groups] as cheap mainstream providers and as the central and 
leading funders of basic research. 


It is within this broader trend that the evidence from the multiple sclerosis sub-field is 
situated. From counts of FAs within this subject area, the MSS is found to be the most 
frequently acknowledged funding agency. This position could be on account of the 
Society’s grant portfolio being constituted by a (diminishing) preponderance of small 
grants and an (increasing) presence of very large grants. However, this may not 
represent the actual financial commitment of different funding agencies to the MS
sub-field. 

Braun (1998) suggests that funding agencies play an important role in structuring 
the cognitive development of science. A similar hypothesis for the MSS would appear 
to be acceptable based on the evidence of its strikingly different bibliometric presence 
(cf. Figure 3): the MSS accounts for 21 per cent and 23 per cent of global FAs in RT2 











and RT3 respectively. The corresponding shares accruing to the MRC are 12 per cent 
and 9 per cent and for the WT they are 6 per cent and 11 per cent. Equally important, 
the FAs in this category are of high impact: 21 per cent of the global RT2/JIL4 FAs are 
accounted for by the MSS. From this we may infer that the society is responding to 
evidence of a mismatch between the research agenda pursued by practitioners and the 
objectives desired by consumers (Gross et al., 1999; Tallon et al, 2000). In this respect, it 
is the networks that funding agencies like MSS promote and sustain that engender the 
observed patterns. Rabeharisoa and Callon (1998, 2000a) emphasise the wider context 
of the networks within which patient groups are embedded. Notable, among other 
factors, is the participation of the relevant target community (patients, carers and 
family members) in the grant review process. For example, the Alzheimer’s Society has 
a consumer-led process of research priority-setting involving a national network of 150 
lay members, whose deliberations are fed into a research commissioning panel that has 
50 per cent lay membership[14]. 

Equally, the research orientation exhibited by MSS might also be explained as the 
result of a policy to take new health interventions into clinical practice and/or to 
identify problems and bottlenecks in clinical practice. These might require either the 
assessment of existing or the development of new interventions, probably both. 
Indicative of this is the clinical trial of cannabinoids for MS sufferers (Rangnekar, 
2004b). This trial is a good example of the facilitative role of the MSS in providing a 
bridge between “experts in experience” and “experts in practice”. It would be useful to 
deepen this analysis by examining the processes through which research funding 
decisions are taken, both to identify and contrast the role of lay expertise and 
professionals. Another route for analysis would be to examine citations of papers in 
clinical guidelines (e.g. Grant et al, 2000) as an alternative “impact” factor. 

A final aspect of MSS’s bibliometric presence is impact factor. Within the sub-field, 
the MRC scores the highest (unadjusted) JIL at 3.10 (cf. Table IV). However, it would 
not be too controversial to assume that the highest impact factor journals within a 
sub-field are the most prestigious and engender the widest diffusion. In this respect, it 
is instructive to note that the MSS and the MRC account for similar global shares of 
FAs in JIL4 (16 per cent). The well-noted bias of lower impact factor in non-basic
research explains part of the MSS’s lower (unadjusted) JIL (2.75; cf. Table IV).
Interestingly, even in RT2, the MSS accounts for the leading share of FAs in JIL4, 21
per cent. The corresponding share for the MRC is 16 per cent and for the WT it is 5 per
cent. If JIL is adjusted to capture “influence” in terms of market share of FAs (cf.
Table IV), the MSS leads the table. This suggests that the larger market share of FAs
accumulated by the MSS enhances its influence within the sub-field. 

This is an interesting result that is better understood within the frame of building 
credibility. Participation in networks is a complex process that requires some “trust” 
between participating individuals and institutions. As trust cannot be purchased — if it 
could then it would have little value (Arrow, 1971) — credibility must be established in 
a currency that is acknowledged within the corresponding network, which in the 
context of a scientific community is peer-reviewed publications. Not surprisingly, even 
enterprises encourage in-house scientists to publish in high-impact peer-reviewed 
journals so as to accumulate credibility (Hicks, 1995; Hicks and Katz, 1997). A similar 
dynamic can be suggested for funding agencies, in particular for non-traditional 
agencies like patient groups. 
To conclude, the evidence on the bibliometric presence of the MSS establishes that it
has high visibility, a distinct research orientation and a high and leading impact factor. It
is not difficult to see how this generates positive benefits for the wider activities of the
society. Inverting some of the logic of the argument, we see that the funding of research
also generates in-house institutional competence that enables the society to engage in
policy and regulatory forums. This could be another instance of Epstein’s (1996, p. 9)
observation that as patients “begin to organise and exchange information, the breadth
and durability of their lay expertise is enhanced”.

As leading funding agencies in their biomedical sub-fields, patient groups like the
MSS will have an indirect, mediated and reciprocal relationship with corporate
research agenda. These complex interactions may largely occur through funding of
scientists that are directly or indirectly connected to the corporate sector. This is a line
of inquiry to be pursued in the future. Equally, the role of the scientists, the bridge 
builders, deserves to be investigated. Of interest would be questions of “capture” by 
scientists who at various times have been advisers to the society. 


Notes 


1. Defining boundaries is problematic. However, following Pavitt (1998), it is important for us 
to maintain a distinction between research that is publicly-funded or funded by business. 
This distinction manifests itself in the type of research funded, access rights, and the
potential for networking. 


2. Despite the allusions of language, it is best to avoid characterising innovation as a linear 
process in what Brooks (1994) has christened the “pipe-line model”. Equally, it is clear that 
the relationship between publicly-conducted and privately-pursued research is complex,
multi-directional and sectorally differentiated. Further, the domains do not necessarily
collapse onto other popularly-used binary opposites, viz. basic and applied research or 
science and technology. 


3. This is largely true with the exception of French scholarship that has examined the evolution 
of techno-economic networks in a specific disease area, i.e. Alzheimer’s (Penan, 1996) and the
role of patient groups in funding biomedical research (Callon, 2001; Rabeharisoa and Callon,
1998, 2000a, b). However, Lewison and Devey (1999) did review the role of the Arthritis
Research Campaign within this sub-field and found that it played a prominent part. 


4. The Department of Health in the UK heralds this “expert patient” and aims at promoting 
self-assessment of symptoms and management of well being (Department of Health, 2000). 


5. The industry’s research expenditures are based on an assumption that 10 per cent of the 
total R&D budgets are devoted to public domain research (Dawson et al., 1998).


6. The Imperial Cancer Research Fund and the Cancer Research Campaign merged in 2002 to 
form Cancer Research UK. 


7. Many of the groups and associations within the AMRC are not necessarily patient groups. 
The leading player, the Wellcome Trust, is the largest non-government source of funds for 
biomedical research. However, recent research reveals that even “small” patient groups 
devote upwards of 20 per cent of their expenditures to funding biomedical research
(Rangnekar, 2004a). 

8. All financial figures are in 1999 prices, having been deflated by the R&D expenditure
deflator (DTI/OST, 2001).

9. It is normal practice to use the term “research level”; however, to avoid suggesting an ordinal
ranking or hierarchy between different research domains the author’s preferred term is
“research type”. While research occurs across a continuum it is nevertheless useful to
classify a journal based on the dominant characteristics of the research material it publishes. 














10. The use of impact factors has not been without its critics. Some have deemed the surrogate 
measure as “dubious” (Boor, 1982) and others have considered their use as “misleading” 
(Moed and van Leeuwen, 1996). Some of the problems result from operational aspects that 
generate a number of inaccuracies (Moed et al., 1996). Then, there are a variety of statistical 
problems, such as the absence of normalisation (Pinski and Narin, 1976), the skewness of 
citations (Seglen, 1992) and other peculiarities arising from self-citation, the bias favouring 
long articles and citation patterns (see Glänzel and Moed, 2002). A number of these problems
have been recognised (Garfield, 1998) and there are continuing efforts to bring a sociological
perspective as a means of developing a “theory of citations” (e.g. Leydesdorff, 1998).


11. Other leading funding agencies, like the MRC, also exhibit a fall in their share of FAs. 
12. Some 30 per cent of the MSS’s FAs are in basic research. 
13. The fact that MRC occupies a leading position in RT1 requires further analysis.


14. At the time of this research (c. 2002/2003), MSS was reviewing its grant review process to
enhance the participation of the target community. 
Appendix
In view of the criticisms of the use of journal impact scores — in particular the well-documented 
evidence of variations in citation rates across research types — it is necessary to make 
modifications to the impact factor. The proposed adjustment presumes that, ceteris paribus, a 
funding agency with higher visibility, i.e. a greater share of FAs, will be considered to be more 
influential. This is reminiscent of Pinski and Narin’s (1976) effort to construct an “influence 
weight” for citations based on the actual number of journal citations. Rather than use a “citation
market share”, we use a funding agency’s market share of FAs.

We begin by establishing weights for the funding acknowledgements which accrue to a
single funding agency, a, with JIL i and RL j. This weight, w_ij^a, is based on the following equation:

[equation not legible]

The weighted JIL index value, I_j^a, accruing to funding agency “a” in research level “j” is derived
from the following equation:

[equation not legible]

It thus follows that the mean-weighted JIL for funding agency “a”, M^a, across all research levels is:

[equation not legible]
Purpose — To test the effect of public-private collaboration on research quality in the UK

biotechnology industry. 

Design/methodology/approach — Development of an economic model suggesting that 

collaboration results in learning gains that lead to higher quality research and subsequently testing 

of this model using unique data from the UK biotechnology sector from 1988-2001. 

Findings — Collaborative research does indeed improve research quality, although the nature of the 

biotech firm in question seems to be an important factor in determining how strongly positive an effect 

public-private collaboration has on research quality. 

Originality/value — Shows that there exists a growing body of work that points to the increasing 

value of public-private interaction for the performance and growth of high technology science-based 
firms and industries. However, research on the effects of this interaction on the resulting quality of

scientific output is scarce. 
Introduction
Theorists studying the economics of technological change have traditionally assumed 
the relationship between basic scientific research and technological invention to be 
linear (Romer, 1990). Publicly-funded universities and research institutes advance 
basic science by producing ideas. This body of knowledge is freely accessible in 
scientific journals and has public good attributes. Business firms, drawing from this 
resource, engage in research and development in the search for technological
inventions. These inventions, protected by patents, then generate rent for the firm 
(Rosenberg, 1988). However, recent literature exploring the interaction of science and 
technology has deepened this simplistic picture by highlighting the blurring of 
boundaries between scientific research and technological invention (Hicks, 2002; 
Murray, 2002; Gibbons et al, 1994). 

On the one hand, traditional repositories of scientific knowledge such as universities 
and research institutes have shifted along the basic-applied spectrum and are 
increasingly involved in commercialising their science via patenting, licensing and
high technology spin-offs. Parallel to this trend, high technology firms have adopted
“open science” academic norms and now routinely engage in basic scientific research 
and employ scientists who regularly publish in scientific journals[1]. In doing so, 
industry scientists often collaborate with university researchers in solving particular 
scientific problems. Gittelman and Kogut (2002) propose that this public-private
partnership plays a critical role in the success of high-technology industries such as
biotechnology. Murray (2002) explores this co-mingling of scientific and technological 
networks in an emerging area of biomedicine. Her findings show that science and 
technology co-evolve via inter-linked networks of scientists that bridge the 
private-public divide. Zucker et al (1998), in their study of the US biotechnology 
industry, show that star university-based scientists played a key role in the birth and 
growth of the industry by playing dual roles as entrepreneurs and research scientists. 
Thus, there exists a substantial and growing body of work that points to the increasing 
value of public-private interaction in the evolution of science and technology and in the 
performance of firms and industries. What is still missing, however, is research that 
delves into the effects of this public-private interaction on the resulting “quality” of 
scientific output. 

This constitutes an important gap in the literature since the specific mechanisms by 
which public-private collaboration might improve firm, industrial or even 
regional-wide performance are not well known. Clearly, improvements to research 
quality brought about by public-private collaboration are one obvious channel by 
which firms and industries may benefit from co-mingling with academic expertise. It is 
this assertion, therefore, that is in need of further exploration. 

This paper attempts to fill this gap by comparing the effect on research quality of 
private-public collaboration using unique data from the UK biotechnology sector from 
1988-2001. Specifically, it tests whether the quality of scientific research undertaken by 
UK biotechnology firms is improved by more intensive collaboration with 
universities[2]. The results suggest that collaborative research involving greater 
shares of academic institutions does indeed improve research quality, although the 
nature of the biotech firm in question is an equally important factor in determining
how strong a positive effect public-private collaboration has on research quality. 

The rest of the paper is organised as follows: first there is a brief literature review 
pertinent to the work. Next a simple analytical model is presented of public-private 
interaction that focuses on knowledge exchange and learning gains to be had from 
public-private collaborative research. There is then a description of the data and 
variables. The results section estimates the impact of public-private collaboration on 
research quality empirically. Finally, there are some suggestions for future work. 


Literature review 

Open science versus private R&D 

Economic debate on the role of publicly funded science begins with the observation 
that there is a substantial gap between social and private returns to R&D, particularly 
for “basic” research. Many researchers have identified externalities arising from the 
public good aspects of knowledge as the source of the gap (Arrow, 1962). The idea that 
the inability of profit-maximising firms to appropriate the full economic returns from 
R&D is likely to lead to under-investment in research relative to the social optimum 








remains an uncontroversial basis for substantial public support of R&D in general and 
basic scientific research in particular (Arora et al, 1995). 

This broad logic could explain the scientific division of labour wherein universities 
(largely subsidised by the government) working under the “open science” paradigm, 
publish scientific articles that are freely available in publicly accessible journals. Firms 
on the other hand, drawing from this “public” body of scientific know-how, invest in 
R&D, mostly in areas of applied research, in order to secure valuable patents that 
would generate rent. However, more recently this “waterfall” model has come under 
criticism, with the distinction between university “open” science and firm level R&D 
increasingly blurring (Murray, 2002; Hicks, 1995; Brusoni et al, 2001). In recent studies, 
scholars have noted that some for-profit firms organise their research in ways that 
mimic the practices found at universities or publicly funded organisations (Dasgupta 
and David, 1994; Cockburn and Henderson, 1998; Gittelman and Kogut, 2002). In 
particular, large pharmaceutical firms and biotech firms have been found to rely 
heavily on collaboration with academic scientists to improve research productivity and 
regularly publish scientific articles in the open science paradigm[3] (Rosenberg, 1990; 
Stephan, 1996). These findings challenge the traditional understanding of the 
distinction between public and private science. Several strategic advantages have been 
identified with the private sector adoption of open science, including the development 
of absorptive capacity (Cockburn and Henderson, 1998), labour costs reduction (Stern, 
1999), and enhancing firms’ competitive position in a patent race (Parchomovsky, 2000; 
Lichtman et al, 2000). 

Drawing upon Cohen and Levinthal’s (1989) “absorptive capacity” argument, 
Cockburn and Henderson (1998) suggest that firms use pro-open-science incentives to 
develop routines and skills that allow them to utilise effectively the advances in 
publicly funded research. The interviews with senior scientists and management of 
pharmaceutical firms conducted by Cockburn and Henderson (1998) indicated that 
firms strived to develop such capacity by recruiting and rewarding scientific 
employees based on their standing in the rank hierarchy of the public-sector science 
and by encouraging them to actively engage themselves in the academic community. 
Ties to the academic community have also been found to underlie the innovative 
activities of biotechnology firms (Zucker et al, 1998). On the assumption that 
university-affiliated scientists prefer research projects that will lead to publications, 
adoption of open science may help firms attract high-quality academic collaborators. 

In addition, Stern (1999) suggested that there might be a labour cost advantage 
associated with pro-publication firms. Based on the analysis of offers accepted by a 
sample of post-doctoral job applicants, he showed that scientists are willing to accept a 
lower wage in exchange for permission to keep up with research in high quality basic 
science. 

Finally, game theorists suggested that publication of discoveries may be used by 
firms engaging in a patent race to establish a higher standard of prior art and prevent 
competitors from having the patent right to a particular invention (Parchomovsky, 
2000; Lichtman et al, 2000)[4]. 

While there is a significant and growing body of literature that investigates the 
theoretical forces driving the public-private interaction in the production of scientific 
knowledge, the empirical literature is less well developed. Though measuring the 
research output of “open science” and its impact on the rest of the economy presents 
enormous challenges, both quantitative and qualitative estimates suggest that the rate 
of return to basic research is probably quite high. Direct quantitative estimates place it 
in the order of 25 per cent-40 per cent (Adams, 1990; Mansfield, 1989). Also studies of 
university-industry relations have found a positive effect of university research on 
private sector R&D (Jaffe, 1986; Mansfield, 1989). However, most of these models focus 
on the training and education function of the university sector rather than its research 
output. There is also a stream of literature in economic history that points to the critical 
role of public sector research in laying the foundation for technological advances that 
have had enormous impact on the economy (see, for example, David et al, 1992). An 
example of this is the US pharmaceutical industry, one of the most science-intensive 
sectors of the economy and one where public support for research has been very 
substantial[5]. Further, the biotechnology industry was incubated within academic 
science and close links between academic science and industry continue to be 
commonplace in this industry. While it seems clear that the industry’s rapid rate of 
technological change and impressive economic performance rest on a foundation of 
long-term publicly funded investments in basic science (Comroe and Dripps, 1976; 
Raiten and Berman, 1993; Maxwell and Eckhardt, 1990; Ward and Dranove, 1995), 
attributing specific tangible payoffs to these investments is difficult. Furthermore, 
there has been no discernible work in the literature measuring the impact of 
private-public collaboration in producing scientific knowledge on research quality, a 
question that this paper addresses using evidence from the UK biotechnology industry. 


Citations and co-authoring 

Unpacking the relationship between “open science” and industry R&D poses several 
methodological challenges. Pioneering work by Narin and Olivastro (1992) and Penan 
(1996) used citation patterns, mainly in the domain of patents, successfully to trace 
interaction among researchers and across organisational boundaries[6]. However, 
analysis of citations to publications presents a number of difficulties. Citation is often 
highly ritualised, occurs with variable and often very long lags, and may represent 
negative as well as positive acknowledgement of previous research. Another more 
direct measure of interaction is joint-authorship of papers. Often, scientists working in 
industry collaborate with their academic colleagues on research publications. This 
pattern of co-authorship probably represents the strongest empirical record of the 
interaction of private firms and academic science (see Zucker and Darby, 1995; Zucker 
et al., 1998; Liebskind et al., 1995). Furthermore, as is often pointed out, citation in the 
age of the word-processor and computerised databases is extremely cheap and easy. 
By contrast, as many researchers can testify from personal experience, joint authorship 
is costly in terms of effort as well as other resources. In order to be willing to 
collaborate on a paper all of the authors must be willing to incur these costs, which 
makes an instance of co-authorship a stronger empirical signal than a citation. Further, 
it appears that co-authorship also shows a qualitatively different kind of interaction 
than does citation. Joint authorship often reflects joint research, which is an 
opportunity for the exchange of tacit knowledge (Heffner, 1981). By contrast citation 
may be seen as an acknowledgement of the exchange of codified knowledge. Citation 
also refers to old knowledge, whereas co-authorship reflects generation and exchange 
of new or current knowledge (Beaver and Rosen, 1978). Thus while citations can often 
be an impersonal referencing to existing knowledge, co-authorship provides evidence 
of joint problem solving and something that represents a much more significant 
investment on the part of the firm. 

Naturally, there are some difficulties with this interpretation. Clearly, co-authorship 
does not capture the entire range of active knowledge exchange among scientists. 
Academic researchers and industry scientists often read each other's work, correspond 
informally, listen to conference presentations, serve on professional committees 
together and so on, which all may serve as legitimate conduits for knowledge 
exchange. Further, co-authorship may also reflect a variety of things other than 
exchange of information and joint problem solving. It may be offered as a quid pro quo 
for supplying information or resources such as money or research materials (Cockburn 
and Henderson, 1996). It may also serve as a way to acknowledge intellectual debts and, 
in the physical and biological sciences, to list laboratory directors or other senior project 
leaders as authors on papers in whose writing they may have had very little involvement 
(Murray, 2002). 

Notwithstanding these issues, we can proceed on the assumption that 
co-authorships represent evidence of a significant investment on the part of the firm 
in developing connections to publicly-funded “open science” research[7]. 


A simple model of private-public interaction 

In this section, a simple model of public-private collaboration in scientific research is 
put forward. As noted in the literature review above, industry and academia often 
co-author scientific papers (Hicks, 2002). What might account for this behaviour? 
Earlier, we reviewed several reasons that could induce firms to adopt norms of 
open science. Further, there are good reasons to think that the act of collaboration itself 
may enhance research productivity. For example, many collaborations centre around 
the joint use of expensive or unique equipment which might require pooling of 
resources (Narin and Rozek, 1988). Also, tacit knowledge and expertise are often best 
conveyed through collaboration (Beaver and Rosen, 1978; Rosenberg, 1990). In 
addition, with increasing specialisation, interdisciplinary research is arguably made 
more productive by collaborators who bring special expertise and knowledge not 
otherwise available, but crucial to research outcomes (Katz and Martin, 1997). It is this 
last incentive of knowledge exchange and learning in scientific collaboration that is of 
particular interest for the model. If we assume that much of this special expertise is 
housed within academia, then private research that is able to access more of this 
knowledge (through co-authorship) should be of higher quality. 

Let us begin with a simple specification of two researchers: industry researcher i 
and public researcher j[8], who are potential collaborators on a research project. There 
are many ways in which one might model the benefits of knowledge exchange if they 
chose to collaborate. We adopt a simple and natural specification: either researcher can 
benefit by acquiring knowledge from the other. Formally, the utility of researcher i in 
any period is $u_i(k_j, x)$, where $k_j$ is the knowledge acquired from j in that period, x is the 
numeraire[9], and $u_i(\cdot)$ is strictly increasing in both arguments and strictly 
quasi-concave. 

Industry researcher i is charged with solving research problems that would benefit 
her firm. However, in doing so, she might require specialised knowledge not available 
within the firm. Often, researchers in universities or public research institutes possess 
these specialised skills required in solving quite difficult scientific problems. Further, 
the nature of this specialised knowledge is, most often, extremely tacit and cannot be 
transferred via codified means such as reading a research publication or manual. Thus 
collaboration is “necessary” in order to access this knowledge (Polanyi, 1967). Given 
the “all or nothing” character of problem solving in scientific research, we therefore 
assume that: 





$$\lim_{k_j \to 0} \frac{\partial u_i(k_j, x)}{\partial k_j} = \infty$$


so that small levels of knowledge are extremely valuable. It is important to note that 
private sector researcher i can collaborate with more than one academic researcher, in 
which case each academic researcher j contributes knowledge $k_j$ to the collaboration. 

We further assume that knowledge exchange is costly. Let $c_i$ be researcher i's cost of 
transmitting a unit of knowledge to any other researcher, and let $y_i$ be income accrued 
from undertaking published research. If the numeraire consumption of researcher i is 
$y_i - c_i k_i$, then the utility of researcher i can be written as[10]: 

$$u_i(k_j,\, y_i - c_i k_i)$$


This specification appears to capture two important aspects of co-authoring behaviour: 
researchers make choices about the knowledge that they transfer[11] and that 
knowledge transfers, though beneficial, are also costly[12]. There are other features 
that are not captured. One is that researchers do not acquire knowledge for their own 
private use. This was excluded for reasons of stylistic simplicity. It is relatively 
straightforward, however, to include this and doing so changes none of the results. 
Another feature is that knowledge does not accumulate in the model. This is less 
straightforward to include, but the complications[6] do not change anything 
fundamental. The model also does not make a distinction between the acquisition 
and transfer of knowledge. One could imagine a different approach where researchers 
acquired knowledge at the beginning of a bargaining game and later exchanged it. 
More learned researchers could exchange knowledge at lower cost. However, adding 
this specification does not significantly alter the results of the analysis and the simpler 
specification seems to be sufficient. 


The potential gains from public-private knowledge transfer 

When there is no direct knowledge exchange, $k_i = k_j = 0$, the utilities of industry 
researcher i and public researcher j equal $u_i(0, y_i)$ and $u_j(0, y_j)$, respectively. When there 
is knowledge exchange, the maximum amount of knowledge that i is willing to provide 
in order to receive an amount $k_j$ from researcher j is implicitly defined by 
$u_i(k_j,\, y_i - c_i k_i) = u_i(0, y_i)$. This defines the knowledge offer function for researcher i, 
$k_i(k_j)$, where: 

$$\frac{dk_i}{dk_j} > 0 \qquad (1)$$
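
The sign in (1) can be checked by implicit differentiation of the indifference condition that defines the offer function. The following is only a sketch: it assumes $c_i > 0$ and writes $u_{i,1}$ and $u_{i,2}$ for the partial derivatives of $u_i$ with respect to its first and second arguments (notation introduced here for illustration, not taken from the original).

```latex
\begin{align*}
u_i\bigl(k_j,\; y_i - c_i k_i\bigr) &= u_i(0,\, y_i) \\
u_{i,1}\, dk_j \;-\; c_i\, u_{i,2}\, dk_i &= 0 \\
\frac{dk_i}{dk_j} &= \frac{u_{i,1}}{c_i\, u_{i,2}} \;>\; 0
\end{align*}
```

Both partial derivatives are positive because $u_i$ is strictly increasing in both arguments, so the amount researcher i is willing to give rises with the amount received. Under the further assumption that the marginal utility of knowledge is unbounded near zero while $u_{i,2}$ stays finite, the slope becomes arbitrarily large close to the origin.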


The knowledge offer function corresponds to the indifference curve in $(k_i, k_j)$ space that 
gives industry researcher i utility level $u_i(0, y_i)$ (Figure 1). 








[Figure 1: indifference curve of public researcher j, $u_j(\cdot) = u_j(0, y_j)$]

Note: Formally, the utility of researcher i in any period is $u_i(k_j, x)$ where $k_j$ is the knowledge 
acquired from j in that period, x is the numeraire and $u_i(\cdot)$ is strictly increasing in both 
arguments and strictly quasi-concave. Similarly, the utility of researcher j in any period is 
$u_j(k_i, x)$ where $k_i$ is the knowledge acquired from i in that period, x is the numeraire, and 
$u_j(\cdot)$ is strictly increasing in both arguments and strictly quasi-concave 


The slope of $k_i(k_j)$ equals the marginal rate of substitution between transmitted 
knowledge $k_i$ and received knowledge $k_j$. The offer function $k_i(k_j)$ is 
strictly concave since $u_i(\cdot)$ is strictly quasi-concave. 

Mutually beneficial knowledge exchange is not always possible. Researcher i is 
willing to make a non-negative contribution $k_i \le k_i(k_j)$ in return for receiving $k_j$. 
Symmetrically, the partner is willing to contribute $k_j \le k_j(k_i)$. The set of mutually 
beneficial exchanges is $\{(k_i, k_j) : k_i \le k_i(k_j),\ k_j \le k_j(k_i),\ k_i \ge 0,\ k_j \ge 0\}$, which is 
denoted by the region between the knowledge offer functions in Figure 1. 

A sufficient condition for the feasible set to include points other than the origin is: 


X ak; .. 3k; 
ao) r — 9 => > 
lim, 0 3k; lim,, 0 3k; 0 


which is satisfied by the assumption that: 


$$\lim_{k_j \to 0} \frac{\partial u_i(k_j, x)}{\partial k_j} = \infty$$

Thus, under our assumptions, there are always potential gains from knowledge 
exchange[13]. This means that when knowledge is verifiable ex ante, knowledge 
transfers can arise through individually rational decisions as a kind of barter[14]. Of 
course, the verifiability assumption is crucial. If knowledge could not be evaluated a 
priori, then in a one-shot game the dominant strategy of each researcher would be to 
set $k_i = 0$. However, most knowledge barter occurs in the context of repeated 
interactions. Moreover, knowledge may be captured empirically by such filters as 
academic reputation. Given the small-world effect of researchers[15] it is reasonable to 
assume that researchers have a fairly strong signal of the quality of their potential 
collaborator and exercise fair judgement before committing resources to begin joint 
research. 

Figure 1. Payoffs for collaboration of two researchers: industry researcher i and public researcher j 

Thus the model suggests that potential learning and knowledge gains can induce 
co-authorship behaviour across the public-private divide. If we also assume that these 
knowledge and learning gains lead to higher states of knowledge in solving particular 
research problems, collaboration should result in research outcomes of higher quality. 
If we assume that scientific communication is characterised by a large number of 
publications that have to be divided according to attributed status, and that research of 
higher quality gets published in journals of higher status[16], we arrive at the testable 
hypothesis that private co-authorship involving a greater share of public contribution 
results, on average, in higher quality publications. If this effect is operative, it may 
therefore highlight one obvious channel of how firms and industries benefit from 
academic input. For example, better quality research in more prestigious journals may, 
among other things, allow young biotechnology firms to form alliances with large 
pharmaceutical firms, help them to raise money and/or attract better quality human 
capital[17]. 

In the empirical section below, we test whether greater shares of academic 
participation improve the quality of industry science, using data assembled from the 
UK biotechnology industry during the years 1988-2001. 

Data and variables 
Data collection 

The data-set of scientific articles analysed in this paper was obtained from the research 
outputs database (ROD) that contains a record of published research outputs for all of 
UK biomedicine and for 32 biomedical sub-fields during the years 1988-2001. These 
data were obtained from the CD-ROM versions of both the Science Citation Index (SCI) 
and the Social Sciences Citation Index (SSCI) produced by the Institute for Scientific 
Information in Philadelphia. The data-set of the entire set of UK biomedical papers 
during 1988-2001 consists of 355,183 scientific papers. Each of these papers is tagged 
with information on its research level (from basic to clinical), its impact (potential 
impact category based on journal reputation), areas of research (32 biomedical 
sub-fields), numbers of authors and addresses on individual papers, its postcode areas, 
domestic and international collaboration details as defined by authors’ addresses, and 
funding information. Of these, 2,915 papers listed at least one author from a 
biotechnology firm (identified by funding acknowledgements) as a contributing author 
and the analysis is carried out only on this subset. 


Variables 

Research quality. Research quality is the dependent variable and it is the most difficult 
and contentious item to measure. In the broadest terms, research quality may be 
defined in terms of how funded research feeds through to the welfare of society 
through health and wealth creation. Wealth might be reflected in the development of, 
for example, new pharmaceuticals, diagnostic reagents or other medical technology. 
Health benefits, however, are manifest in better patient care and preventive measures 
based on regulation or advice. Therefore, the routes to health creation may be 
considered in terms of research papers leading to, for example, new techniques for 
diagnosis and treatment, improved medical education and training or better clinical 
care based on clear evidence-based guidelines and recommendations. For impacts on 
wealth creation, one could examine rent-seeking activities of firms as linked to their 
research and publishing activity. In this paper however, we seek to determine how the 
community of researchers engaged in biomedical research might benefit from 
public-private collaboration and hence research quality is defined as the effect of 
research on other researchers. Thus, it may reflect the importance or quality of the 
research qua research but it is not necessarily an indicator of clinical utility or wealth 
creation. Subsequent papers will explore the link of research quality to wealth creation 
via the profitability of biotech firms. 

If we define research quality in this way, we can assess the importance of a paper by 
several measures. The two strongest indicators are its actual impact as determined by 
citation counts to the individual paper and its potential impact as judged by the journal 
in which it is published. The first and second measures are complementary but not 
identical, although papers in high impact journals tend to have more citations (rather 
as the children of very intelligent parents are usually of above average intelligence 
themselves). Both are useful indicators and show how a paper has been judged by two 
different readerships: the general body of researchers and, in the second case, a journal 
editor plus a few specialised reviewers. In this paper, given the difficulties of 
estimating citations over the entire data-set (2,915 papers), journal status has been 
used as a proxy value for research quality. 

Research quality (denoted PIC) is based on a simple system in which journals are 
put into four categories (which take values between 1-4). These are based on the mean 
number of citations to papers published in the journal in a given year that are received 
in the year of publication through the fourth year after publication. The values are in 
increasing order of quality, so that 1 denotes the lowest while 4 denotes the highest 
quality publication. 

Collaboration. Each paper is tagged by the total number of authors (denoted by 
AUT), the number of biotech firms (AUTFIRM), and the number of collaborating 
academic institutions (AUTAC). The data-set includes only those papers which have at 
least one author from a biotech firm. To measure the impact of academic contribution 
on a biotech research paper, the simple ratio of academic institutions to the total 
number of participants on a paper (AUTACRAT)[18] is calculated and the square of 
AUT (AUTSQ) is used to gauge the diminishing returns from academic contribution. 
The idea here is that too many “cooks” may spoil the broth. 
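
As a minimal sketch of how the two derived collaboration variables could be computed for a single paper, assuming AUTACRAT is taken as AUTAC divided by AUT, as described above. The function name and the example counts are hypothetical and are not drawn from the data-set.

```python
def collaboration_variables(aut: int, autac: int) -> dict:
    """Derive AUTACRAT and AUTSQ for one paper.

    aut   -- total number of authors listed on the paper (AUT)
    autac -- number of collaborating academic institutions (AUTAC)
    """
    if aut < 1:
        raise ValueError("a paper needs at least one author")
    if autac < 0:
        raise ValueError("AUTAC cannot be negative")
    return {
        "AUT": aut,
        "AUTAC": autac,
        # Share of academic participation on the paper.
        "AUTACRAT": autac / aut,
        # Squared author count, used to test for diminishing returns.
        "AUTSQ": aut ** 2,
    }


# Hypothetical paper: six participants, four of them academic.
example = collaboration_variables(aut=6, autac=4)
```

A paper with no academic participation yields an AUTACRAT of zero, which matches the minimum reported for this variable in the descriptive statistics.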

Research level. The research level (RL) of an individual research paper is categorised 
as more basic or more applied with the use of a categorisation developed by Lewison 
and Paraje (2004), based on the presence of one or more of about 100 “clinical” or 
“basic” words in the titles of papers in a given journal and year. This gives a decimal 
number between 1.0 (most applied) and 4.0 (most basic)[19]. 

Control variables. There is evidence to suggest that foreign collaborations can 
impact positively on research quality (Zucker et al, 1998; Narin and Olivastro, 1992). 
The presence of foreign collaborators can be determined from the number of foreign 
addresses (DFOR) on an individual research paper. The firm variable (FIRM) can 
account for firm effects, and records by whom the principal biotech author is employed. 
There are 114 firms present in the data-set. The year of publication (YEAR) accounts 
for learning effects in doing research by biotech firms. It could be, for instance, that 
there is an overall increasing trend in research quality over time for the firms in the 
data-set. 

A full listing of the key variables and descriptive statistics can be found in Table I. 

Table I. Variables for analysis of collaboration effect on research quality 

Variable name                        Definition
Research quality (PIC)               Potential impact of a research publication. Proxy for research quality. Ranked variable from lowest to highest quality (1 to 4)
Total authors (AUT)                  Number of authors listed on a paper
Biotech firms (AUTFIRM)              Number of biotech firms listed on a paper
Academic institutions (AUTAC)        Number of academic institutions listed on a paper
Academic contribution (AUTACRAT)     Ratio of academic institutions to total number of collaborators
Square of total authors (AUTSQ)      The square of total number of authors
Research level (RL)                  Research level on basic/applied scale of a research paper
Foreign addresses (FOR)              Number of foreign addresses listed on a paper
Foreign addresses dummy (DFOR)       Presence of foreign addresses listed on a paper. 1 if foreign presence and 0 otherwise
Year                                 Year of publication dummy

Descriptive statistics. Table II provides some basic descriptive information on the 
analytical variables of interest. In Tables III-V, the descriptive statistics found in 
Table II are analysed by research level, research quality and academic contribution. 
The data in Table III suggest that the more basic the nature of the research, the higher 
is the research quality. There is some evidence in the literature to support this 
tendency. It has been documented that in many fields of research, papers that concern 
fundamental or basic research tend to have greater impact than papers of a more 
applied nature (Lewison and Dawson, 1998; Lewison and Devey, 1999). 

Table II. Descriptive statistics 

Variable name                      Mean     SD       Min    Max     Obs
1. Research quality                2.467    1.032    1      4       2,915
2. Total authors                   5.600    1.798    1      11      2,915
3. Biotech firms                   1.086    0.302    1      3       2,915
4. Academic institutions           4.514    1.795    0      7       2,915
5. Academic contribution ratio     0.722    0.209    0      0.985   2,915
6. Square of total authors         45.791   13.544   1      121     2,915
7. Research level                  3.067    0.902    0      3.98    2,915
8. Foreign addresses               0.566    1.306    0      21      2,915
9. Year                            -        -        1988   2001    2,915

Table III. Research level by category and research quality 

Research level     Research quality mean (PIC)     Observations
Applied            2.02                            194
Medium             2.24                            659
Basic              2.58                            2,062

In Table IV, it is interesting to note that there is no discernible increase in the ratio 
of academic collaboration as we move from applied to basic research level. This is not 
so counter-intuitive, however, since the data-set, by definition, has only selected papers 
that have at least one firm co-author who presumably is interested in more applied 
work to begin with. 

Finally, Table V shows some descriptive evidence that increased academic 
contribution on a research paper does improve research quality. The ratio of academic 
participation rises from 0.68 to 0.78 as one moves from lowest to highest research 
quality publications, meaning that more academic input is positively correlated with 
public-private research quality. This result can be checked for its robustness to the 
inclusion of control variables, firm fixed effects and different model specifications. 
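
One quick consistency check on these descriptive figures is to pool the group means reported for the four research quality categories and compare the result with the overall mean of the academic contribution ratio. The sketch below uses the category means and observation counts as read from Table V; the function and variable names are illustrative only.

```python
# (mean AUTACRAT, observations) for each research quality category PIC = 1..4,
# as read from Table V.
table_v = {
    1: (0.67, 631),
    2: (0.71, 850),
    3: (0.73, 874),
    4: (0.78, 560),
}


def pooled_mean(groups: dict) -> float:
    """Observation-weighted mean across categories."""
    total = sum(n for _, n in groups.values())
    return sum(mean * n for mean, n in groups.values()) / total


total_observations = sum(n for _, n in table_v.values())
overall_ratio = pooled_mean(table_v)
```

The category counts add up to the 2,915 papers in the data-set, and the pooled ratio is close to the overall mean of 0.722 reported in the descriptive statistics, the small gap being attributable to rounding of the category means.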


Results 

The baseline estimate 

In order to measure the impact of private-public collaboration on research quality, we 
begin with a simple baseline specification: 


$$PIC_{i,t} = \alpha + \beta_1 AUTACRAT + \beta_2 AUT + \beta_3 DFOR + \beta_4 RL \qquad \text{(Model 1)}$$


This specification captures the effect of academic input AUTACRAT on research 
paper quality, $PIC_{i,t}$, in year t, controlling for total number of authors AUT, the 
presence of foreign collaborators DFOR and research level RL. Table VI, column (1) 
presents ordered logit[20] estimates for model 1. All estimates were also conducted 
using ordinary least squares (OLS). The results are comparable save for the explanatory 
power of the model, which is higher in the OLS specifications. In column (1) there is 
statistically significant evidence that higher levels of academic involvement result in 


Table IV. Research level by category and academic contribution 

Research level     Academic contribution ratio mean (AUTACRAT)     Observations
Applied            0.70                                            194
Medium             0.73                                            659
Basic              0.72                                            2,062

Table V. Research quality and academic contribution 

Research quality (PIC)     Academic contribution ratio mean (AUTACRAT)     Observations
1                          0.67                                            631
2                          0.71                                            850
3                          0.73                                            874
4                          0.78                                            560

Table VI. The effect of academic input on public-private research quality: dependent variable = quality of research (PIC = 1 to 4) 

O-Logit? (1) O-Logit? (2) O-Logit®” (3) O-Logit? (4) 

1. Academic contribution ratio 0.656" 0.409* 0.676% 0.8157 
2. Total authors 0.073 0.114%** 0.073 0.057* 
3. Total authors squared = — 0.004 = = 

3. Foreign addresses 0.103** ` 0.0973 0.099 0.124 
4. Research level 0.517% 0.5154 0.522 0.4874 
5. Year dummies No No Yes Yes 

6. Firm dummies No No : No Yes 

` R-squared® 0.10 0.10 0.11 0.23 


Notes: Number of observations = 2,915; *Significant at 10 per cent level; **Significant at 5 per cent 
level; ***Significant at 1 per cent level; (a) Ordered logit regression with research quality as dependent 
variable and academic author ratio, total number of authors, number of foreign collaborators and 
research level as independent variables; (b) There is also a specification as in (1) but with standard errors 
clustered around individual biotech firms. The results are similar and available upon request; (c) The 
values of R-squared from OLS regressions, which is a better representation of the model’s explanatory 
power, are given. It should be noted that they are double the magnitude of the O-Logit, which in any 
case is a pseudo-measure and therefore an approximation of the R-squared value from OLS 


higher research quality, with the total number of authors, foreign collaboration and 
research level held constant. 
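
To illustrate how an ordered logit turns the linear index of Model 1 into probabilities over the four quality categories, the sketch below uses the column (1) coefficients as read from Table VI. The cutpoints are not reported in the paper, so the three values used here are purely hypothetical, as are the function names and the example paper.

```python
import math


def logistic(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(index: float, cutpoints: list) -> list:
    """Category probabilities, with P(PIC <= c) = logistic(cutpoint_c - index)."""
    cumulative = [logistic(c - index) for c in cutpoints] + [1.0]
    probs, previous = [], 0.0
    for value in cumulative:
        probs.append(value - previous)
        previous = value
    return probs


def model1_index(autacrat: float, aut: int, dfor: int, rl: float) -> float:
    """Linear index using the column (1) coefficients as read from Table VI."""
    return 0.656 * autacrat + 0.073 * aut + 0.103 * dfor + 0.517 * rl


# Hypothetical cutpoints (assumed; the paper does not report them).
CUTPOINTS = [1.0, 2.5, 4.0]

# Same hypothetical paper with a low and a high academic contribution ratio.
low_share = ordered_logit_probs(model1_index(0.5, 5, 0, 3.0), CUTPOINTS)
high_share = ordered_logit_probs(model1_index(0.9, 5, 0, 3.0), CUTPOINTS)
```

Because the AUTACRAT coefficient is positive, raising the academic share shifts probability mass away from the lowest category and towards the highest one, whatever cutpoints are assumed.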


Are there diminishing returns to total number of authors? 
A question not answered by model 1 is at what point (if at all) diminishing returns 
set in with increased total authorship (i.e. do too many scientists spoil the broth?). 
Model 2 can be used to estimate this. Specifically, to test for diminishing returns of 
total number of authors, a squared term for number of authors, AUTSQ, is added and 
the following can be specified: 

$$PIC_{i,t} = \alpha + \beta_1 AUTACRAT + \beta_2 AUT + \beta_3 AUTSQ + \beta_4 DFOR + \beta_5 RL \qquad \text{(Model 2)}$$

After the removal of outliers in the total number of authors (some papers had upwards 
of 60 authors) there appears to be a significant curvilinear relationship between the 
number of co-authors and research quality. The threshold number of total authors, after 
which adding authors results in decreasing quality, can be calculated from the 
coefficients for AUT and AUTSQ in Table VI, column (2). This value turns out to be 14. 
This is the number after which adding further authors actually “decreases” the quality 
of the resulting publication. It is important to note that although the coefficient for 
AUTACRAT falls by a third, it remains significantly positive. 
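
The threshold can be reproduced from the turning point of the quadratic in AUT, which lies where the derivative of the index with respect to AUT is zero. The sketch below assumes the column (2) coefficients are 0.114 for AUT and -0.004 for AUTSQ, as they appear to read in Table VI; the function name is illustrative.

```python
def turning_point(beta_linear: float, beta_squared: float) -> float:
    """Author count at which beta_linear * AUT + beta_squared * AUT**2 peaks."""
    if beta_squared >= 0:
        raise ValueError("no interior maximum unless the squared term is negative")
    # Setting the derivative beta_linear + 2 * beta_squared * AUT to zero.
    return -beta_linear / (2.0 * beta_squared)


# Coefficients assumed from Table VI, column (2).
threshold = turning_point(0.114, -0.004)
```

With these values the turning point is about 14.25 authors, in line with the threshold of 14 quoted in the text.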





Are there learning effects over time? 
There is some evidence to suggest (Cockburn and Henderson, 1996) that firms get 
better at doing research with time. In order to control for these learning effects, Model 
1 is modified to include year dummies YEARDUM: 

$$PIC_{i,t} = \alpha + \beta_1 AUTACRAT + \beta_2 AUT + \beta_3 DFOR + \beta_4 RL + \beta_5 YEARDUM \qquad \text{(Model 3)}$$








From Table VI, column (3), we can see that the model specification, as judged by the 
R-squared, improves with this addition. The individual year dummies, however, reveal 
no overall rising time trend in research level nor do they dampen our AUTACRAT 
coefficient. 


Does controlling for firm effects improve our results? 
Finally, Model 3 is modified to include firm dummies FIRMDUM: 

$$PIC_{i,t} = \alpha + \beta_1 AUTACRAT + \beta_2 AUT + \beta_3 DFOR + \beta_4 RL + \beta_5 YEARDUM + \beta_6 FIRMDUM \qquad \text{(Model 4)}$$


This is an important addition since not all private involvement is of the same quality. 
By not including some control for the firm from which the private researcher is 
drawn, we may be excluding valuable information that would improve our estimates of 
the effect of academic collaboration. Indeed, given that we assumed that collaboration 
is not random, it is likely that the best firms attract better and more numerous 
collaborators from academic institutions. From Table VI, column (4), we can see that 
the model specification is further improved, with an R-squared that is double that of 
our baseline results. Moreover, as expected, the effect of academic collaboration 
increases in size and significance with the inclusion of firm dummies[21]. 


Conclusion 

Data from 1988-2001 of UK biotechnology’s research output were used to investigate 
the impact of public-private collaboration on research quality. The theoretical model 
suggests that knowledge exchange and learning gains could induce public-private 
collaboration. Further, we may conjecture that these learning gains could be greater, 
the more intense the public contribution and could eventually lead to higher states of 
knowledge in solving research problems and thus culminate in research publications of 
higher quality. There is evidence for this in the data. Research quality is positively 
correlated with higher levels of academic contribution on a research paper. 

However, this is not the only possible explanation. It may be that instead of 
knowledge gains leading to higher quality research publications, academic 
co-authorship acts as a form of “club membership” required for getting published in 
more prestigious journals. Further, if club effects do influence research quality, it may 
be that academic authors at prestigious universities and research institutes can more 
easily publish in prestigious journals, though no allowance has been made for the 
institutional affiliations of the academic authors. 

However, irrespective of whether club-effects, as opposed to “true” improvements in 
research quality, are the cause of getting published in prestigious journals, getting 
published in and of itself might bring private firms other benefits. For example, getting 
published in top journals might allow young biotechnology firms to raise money more 
easily, attract better human capital and form alliances with large pharmaceutical firms. 
The findings in this paper therefore set the stage for investigating how public-private 
collaboration might have an impact on firm performance (i.e. profitability, cost of 
capital, growth). This appears to be a fruitful area of future research and will be 
pursued in subsequent papers by this author. 
Notes

1. The overwhelming majority of scientific publications still originate from the public sector. In
comparison, firms play a marginal (but not unimportant) role in scientific publication.
However, when firms publish they often do so in collaboration with academic researchers.

2. In this paper, the terms “university”, “public” and “open” are used interchangeably when
reference is being made to scientific research conducted without business firms.

3. Often publication counts of large pharmaceutical and biotech firms are comparable to, and
sometimes exceed, the output of similarly sized university departments and research
institutes (Hicks, 1995).

4. The US patent system is based on a first-to-invent rule and uses prior art as a reference point
against which new patent applications are evaluated. Since all new patents must meet the
standard of non-trivial innovation over the prior art, publishing research advances could
raise the standard of prior art and effectively set the bar higher for competitors seeking to
patent a similar technology.

5. Between 1970 and 1995, public funding for health related research increased nearly 200 per
cent in real terms to $8.8 billion or 36 per cent of the non-defence federal research budget, an
amount roughly equal to the total research expenditure of all the US pharmaceutical firms
(Cockburn and Henderson, 1996).

6. A deeper discussion of the methodological issues in using citations can be found in
MacRoberts and MacRoberts (1989).

7. See Beaver and Rosen (1979) for a full treatment of using co-authorship as a measure of
scientific collaboration.

8. By “public”, we mean researchers whose primary output is within the paradigm of “open
science” such as those employed by universities, research institutes and government
research agencies.

9. The numeraire is the money unit of measure in an economic model and can capture the
payoffs to the researchers as a result of their collaboration.

10. There is evidence to suggest that scientists who publish in more prestigious journals get
paid more (Zucker and Darby, 1995). While this is true in both academia and industry, in
academia the prestige value of a good publication is more immediate (Stern, 1999). Therefore,
in the specification y can equally represent generic benefits; income for i and academic
prestige for j.

11. Evidence and intuition suggest that researchers choose their collaborators with care and
not randomly. See Invisible Colleges: Diffusion of Knowledge in Scientific Communities, by
Diana Crane (1972) for a fuller treatment.

12. These costs include the usual transaction costs involved in working with others (Landry
et al., 2003) as well as additional uncertainty and risk involved in collaborating across the
public-private divide.

13. In the case where knowledge accumulates, a researcher who already has a lot of knowledge
would derive less marginal utility from the potential contribution of another researcher.
However, it would also be reasonable to suppose that a researcher who was already well
informed would have lower costs of transferring knowledge. The first effect shrinks the set
of feasible knowledge exchanges, while the second enlarges it.

14. It may be necessary for researchers to make expenditures on knowledge acquisition in order
to allow the barter in proposition (1) to occur. In this case, the possibility of knowledge barter
gives researchers an incentive to engage in costly knowledge acquisition.








15. For a detailed treatment of small-world effects in scientific collaboration see “The structure of
scientific collaboration networks” by Newman (2001). 

16. For a detailed analytical model see “Competition among scientists for publication status: 
toward a model of scientific publication and citation distributions” by van Raan (2001). 


17. In subsequent work, these effects will be investigated from the financial and performance 
data of the 114 firms present in the data-set. 

18. The share of academic collaborators is based on the addresses that appear on the 
publication. However, the number of addresses need not necessarily coincide with the 
number of co-authors. At times, several academic authors are based at the same organisation 
while there may be only one author from a firm. In that sense, the measure captures 
collaboration at the institutional level rather than at the individual level. 


19. For a detailed description see Lewison and Paraje (2004). 


20. Logit is a term from econometrics that captures how the model measures the variables, 
which are categorical and not continuous, and then calculates the coefficients. 


21. The same model can be used with standard errors clustered around individual biotech firms. 
This specification marginally refines the results (available upon request). 
Abstract
Purpose — To identify the papers, and publishing journals, describing psychiatry, surgery and 
paediatrics research funded by the National Health Service (NHS) in the UK. To make comparisons 
with non-NHS research and examine the journal impact factors, and importance to clinicians, of 
journals publishing the most NHS research. To consider the implications, including those for research 
assessment. 

Design/methodology/approach — Existing databases were examined: the research outputs 
database (ROD), which contains information on UK biomedical papers; NHS ROD, which contains
details of papers on ROD funded by the NHS; lists of journal impact factors. These were combined 
with selective findings from surveys conducted to identify journals read and viewed as important for 
clinical practice by psychiatrists, surgeons and paediatricians. 

Findings — In each specialty many papers publish NHS-funded research and they out-number the 
non-NHS papers in the ROD. They appear in a wide range of journals but in each specialty one journal
is clearly the most used. The impact factors of journals publishing the most NHS research vary 
considerably. In each specialty the journal containing most NHS publications is widely perceived to be 
important by clinicians. 

Research limitations/implications — Much NHS-funded research is also funded by other bodies.
Clinician survey response rates were between 38 per cent and 47 per cent. The analysis could be 
extended to other specialties. 

Practical implications — Papers published in the few journals in each specialty that are viewed as 
important by clinicians could be given additional credit in assessments. 

Originality/value — This paper describes outputs from NHS research and shows how assessment 
could be extended. 


Keywords Research work, Psychiatry, Surgery, Paediatrics, Journals, Financing 
Paper type Research paper 


Introduction 

The National Health Service (NHS) spent about £500 million on research in the year 
2002/2003 (Department of Health, 2004). In an era of growing accountability it is 
relevant for questions to be asked about what knowledge is being produced and where 
it is being published (Buxton et al., 2000; Croxson et al., 2001; Harrison and New, 2002).
At the same time, bodies that fund health research are increasingly being expected to
show the benefits from their expenditure in terms that go beyond the criteria
traditionally used in academic assessments such as peer review of the knowledge
produced (Buxton and Hanney, 1996; Hanney et al., 2004; Lewison, 2003, 2004). This is
especially true for institutions such as the NHS in the UK that could, instead of
spending money on research, use it directly on providing health care.

The authors gratefully acknowledge the support from the NHS Executive, London, for the NHS
ROD Fellowship and support from the Department of Health Policy Research Programme.

Aslib Proceedings: New Information Perspectives, Vol. 57 No. 3, 2005, pp. 278-290.
© Emerald Group Publishing Limited. DOI 10.1108/00012530510590228

One aspect of this issue leads us, in turn, to questions about how far publication in
peer-reviewed journals, the primary output from research, is an appropriate
mechanism to use to disseminate research findings, encourage the use of
research and achieve broader benefits. Although Coomarasamy et al. (2001) found that
journals had not been an effective route for the dissemination of research findings, they
concluded that medical journals were in a unique position to improve dissemination.
Schein et al. (2000) reported that clinicians considered peer-reviewed journals to be
their most important source of information.

Given the large number of journals that are published, a key question becomes how 
important are the individual journals in which the research is published. An 
increasingly important way of assessing this is through the impact factors of journals, 
which show the average number of times an article in a particular journal is cited. This 
is a fundamental citation-based measure of the “significance and performance of 
scientific journals” (Glänzel and Moed, 2002). There is some evidence of correlations 
between the impact factors of journals and measures of quality such as rejection rates 
(Yamazaki, 1995), but various technical criticisms are made about the way in which 
journal impact factors (JIFs) are constructed (Glänzel and Moed, 2002). Despite
warnings from the creator of impact factors (Garfield, 1996) they have been used as a
convenient shorthand in the evaluation of researchers; an approach that has been
strongly criticised in leading medical journals (Seglen, 1997). For example, problems
arise because journals in different fields can have very different impact factors
(Wellcome Trust and NHS Executive, 2001), which means that unsophisticated
application of impact factors could damage the researchers in certain fields (Schwartz
and Lopez Hellin, 1996). Furthermore, in the recent debate about the pressures 
currently faced by academic medicine (Bell, 2003) one of the key issues has become the 
sometimes inappropriate demands being made for researchers to aim their papers at 
journals with a high JIF (Abbasi, 2004). 

If the issues raised above are to be better addressed, several strands of analysis 
would be informative. It would be useful to know the extent and nature of the 
publication of health research and, in particular, for the link between publications and 
specific research funding to be better understood (Croxson et al., 2001; Butler, 2001).
The research outputs database (ROD) provides bibliographic information on these 
issues in the UK. It is constructed by the addition of various details, especially on 
funding acknowledgements, to the data about all biomedical publications with at least 
one UK address that are held on the science citation index (SCI) and the social sciences 
citation index (SSCI) (Dawson et al., 1998). The NHS ROD, a subset of the ROD,
contains details of papers from England that involve some element of NHS financial
support (Wellcome Trust and NHS Executive, 2001). The determination of which
papers involve NHS funding is not straightforward unless there is a specific NHS 
funding acknowledgement. Therefore, any paper describing research carried out on 
NHS premises is considered to have some element of NHS funding and is included in 
the NHS ROD. The nature of its construction means that the NHS ROD includes a wide 
range of research, including trials, much of which is primarily paid for by other funders
but conducted on NHS premises. 

Many practitioners in the medical field still continue to rely on print journals 
(Tenopir et al., 2004). Therefore, notwithstanding the increased focus on individual
papers that comes as a result of electronic access and the use of bibliographic 
databases such as Medline, it is still important to conduct some analysis at the journal 
level. Attention is now being given to the attitudes taken by clinicians towards 
individual journals. Saha et al. (2003) concluded from such a study that impact factors 
may be a reasonable indicator of perception of quality for general medical journals. 
Other studies, however, suggested that the relationship between impact factors and the 
journals viewed as significant by practitioners varied depending upon the type of 
science being conducted (Lewison et al., 2001).

To gain a fuller understanding of these issues and the nature of the research funded
by the NHS, it is useful to examine the pattern of journal usage for publication in 
various clinical specialities. This should allow common themes to emerge as well as 
allowing analysis of any speciality-specific issues. The first report on the NHS ROD 
(Wellcome Trust and NHS Executive, 2001) provided considerable information about 
overall patterns of publication in each of a number of specialities. This current article 
focuses on the three specialities of psychiatry, surgery and paediatrics, chosen for their 
strongly clinical nature, and examines the pattern of journal usage. For each speciality 
we report on three main strands of analysis. 

First, we consider how many, and what proportion of, English publications in each
of the three specialities have some element of NHS funding and analyse the number of
journals used to publish NHS and non-NHS research. Second, we identify the journals
most used as publication outlets for each speciality and examine their JIFs. Third, we
draw selectively on survey findings to show the variation in importance attached by
clinicians to the journals most widely used to publish English NHS research.


Methodology 

The number of papers and journals 

The NHS and non-NHS data-sets were obtained from the research outputs database 
(ROD) within the period 1990-2001. NHS papers were identified using a filter for 
England consisting of the following three parts: 


(1) Strings “HOSP”, “INFIRM” and “NHS” applied to the address field together 
with a country classification of “ENGLAND”. 


(2) Comparison with postcodes from the adapted NHS organisation codes database 
(OCS), excluding those postcodes relating to parts of the UK other than 
England. The OCS database is managed by the Department of Health (DoH) as 
a source of nationally-agreed NHS addresses and postcodes in England and 
Wales. 


(3) NHS/DoH funding body codes (three-letter abbreviations covering 24 separate
NHS/DoH funding bodies) (Wellcome Trust and NHS Executive, 2001).


Any paper that satisfies one or more of these criteria is brought into the NHS data-set 
(see Figure 1). 
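The three-part filter above amounts to a logical OR over address strings, postcodes and funding codes. A minimal sketch of that logic follows. The `Address` and `Paper` record layouts, the function name, and the sample postcodes and funding code are hypothetical illustrations, not the actual ROD schema or OCS data.

```python
from dataclasses import dataclass, field

# Address substrings named in part (1) of the filter.
ADDRESS_STRINGS = ("HOSP", "INFIRM", "NHS")


@dataclass
class Address:
    text: str      # free-text address field (hypothetical layout)
    country: str   # country classification, e.g. "ENGLAND"
    postcode: str  # postcode attached to the address


@dataclass
class Paper:
    addresses: list
    funding_codes: set = field(default_factory=set)


def is_nhs_paper(paper, nhs_postcodes, nhs_funding_codes):
    """Return True if the paper satisfies at least one of the three criteria."""
    for addr in paper.addresses:
        in_england = addr.country.upper() == "ENGLAND"
        # (1) An NHS-related string in an address classified as England.
        if in_england and any(s in addr.text.upper() for s in ADDRESS_STRINGS):
            return True
        # (2) Postcode present in the England-only NHS postcode list.
        if addr.postcode.upper() in nhs_postcodes:
            return True
    # (3) Acknowledgement of an NHS/DoH funding body code.
    return bool(paper.funding_codes & nhs_funding_codes)


# Made-up reference data for illustration only.
nhs_postcodes = {"ZZ1 1ZZ"}
nhs_funding_codes = {"XYZ"}

paper = Paper([Address("CITY HOSP, DEPT SURG", "ENGLAND", "ZZ9 9ZZ")])
print(is_nhs_paper(paper, nhs_postcodes, nhs_funding_codes))
```

Because a single matching criterion is enough, the filter is deliberately inclusive, which is consistent with the observation that the NHS data-set contains much research primarily paid for by other funders.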

The medical specialties studied here have been defined in terms of filters
specifically constructed by The Wellcome Trust for the study of the ROD (Wellcome
Trust and NHS Executive, 2001). The most recently constructed versions of these
filters were used to identify the papers relating to psychiatry, surgery and paediatrics 
for both NHS and non-NHS data-sets and the journals in which they were published. 
The data-sets studied covered the periods 1990-1999 for psychiatry and surgery and 
1997-2001 for paediatrics. Differences in the time spans of the data-sets are due to the 
ROD data available at the time of the questionnaire surveys described later. Various 
comparisons were made between the NHS and non-NHS data-sets. 


Figure 1. NHS ROD dataset construction

Table I. Comparison of the numbers of English NHS-funded and non-NHS papers


Identifying key journals and their JIFs

The journals with the largest numbers of NHS and non-NHS papers were identified
from the NHS and non-NHS data-sets for each of the three specialties and their journal
impact factors examined. The JIFs were obtained from the 2001 edition of the on-line
Journal Citation Reports (JCR, Thomson ISI) for psychiatry and surgery and the 2003
edition for paediatrics. The JIF in the Journal Citation Reports is “a measure of the
frequency with which the ‘average article’ in a journal has been cited in a particular
year or period”. “The annual JCR impact factor . . . of a journal is calculated by dividing
the number of current year citations to the source items published in that journal
during the previous two years” by the number of articles published in that journal
during the previous two years (Garfield, 2001).
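The definition quoted above is a simple ratio, sketched here. The function and parameter names are illustrative, and the citation and article counts are invented for the example rather than taken from the Journal Citation Reports.

```python
def impact_factor(citations_this_year: int, articles_prev_two_years: int) -> float:
    """Current-year citations to items from the previous two years,
    divided by the number of articles published in those two years."""
    if articles_prev_two_years <= 0:
        raise ValueError("the journal must have published at least one article")
    return citations_this_year / articles_prev_two_years


# Invented counts: 820 citations in the JCR year to 200 articles from the
# two preceding years.
jif = impact_factor(820, 200)
print(round(jif, 1))
```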


Journal importance to clinicians

The data on perceived levels of importance of individual journals to clinical practice
were obtained from three questionnaire surveys of practising clinicians. Details of the
first of these similar surveys have been published previously (Jones et al., 2004).

Consultants’ names and addresses were obtained for the surveys from two sources.
First, 1,200 psychiatrists were randomly selected from the members of The Royal
College of Psychiatrists. Second, the Medical Directory for 2003/2004, produced in
conjunction with the Royal Society of Medicine, provided the names of 2,660 surgeons 
and 2,330 paediatricians. The questionnaire surveys each contained a list of between 30 
and 40 journals relevant for the specialty, based partly on the journals most used to 
publish NHS research in the field. All three lists included some of the leading general 
medical journals, such as BMJ and The Lancet. Recipients were asked to tick up to ten 
journals that they read or consulted on a regular basis to inform their clinical practice, 
adding any extra journal names if necessary, and to rank the top three. 


Findings 

The number of papers and journals 

Table I shows that a large number of English papers are published in each specialty
and that in every specialty over half the papers appear in the NHS data-set. Nevertheless, there is
considerable variation between the specialties in terms of the proportion of the total 
papers that are NHS-funded. Of the 12,301 psychiatry papers 61 per cent were 
NHS-funded and 39 per cent not NHS-funded. For 9,817 surgery papers and 8,594 
paediatric papers the figures were 92 per cent/8 per cent and 73 per cent/27 per cent 
respectively. 

With regard to the journals used for publishing the English papers, Table I shows
that the numbers of journals are large for all three specialties, though the position is
more complex as some journals contain both NHS and non-NHS papers. Of the
psychiatry journals 67 per cent contained NHS papers and 76 per cent contained
non-NHS papers. For surgery and paediatrics the figures were 93 per cent/35 per cent
and 76 per cent/66 per cent respectively. This means that about a third of the journals
publishing English psychiatry research have not been used to publish NHS research. A
smaller proportion of the journals publishing paediatric research (24 per cent), and
an even smaller proportion of those publishing surgery research (7 per cent), have not
been used to publish NHS research over the time span covered by the data used.
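The percentages in this paragraph follow from simple set arithmetic, sketched below on the assumption that every journal in a specialty data-set contains at least one NHS or non-NHS paper. The function name is illustrative; the input percentages are those reported in the text and Table I.

```python
# Percentage of journals containing NHS and non-NHS papers (from Table I).
JOURNAL_SHARES = {
    "Psychiatry": {"pct_nhs": 67, "pct_non_nhs": 76},
    "Surgery": {"pct_nhs": 93, "pct_non_nhs": 35},
    "Paediatrics": {"pct_nhs": 76, "pct_non_nhs": 66},
}


def journal_overlap(pct_nhs: int, pct_non_nhs: int):
    """Return (% of journals with both kinds of paper, % never used for NHS papers)."""
    both = pct_nhs + pct_non_nhs - 100   # inclusion-exclusion on two sets
    never_nhs = 100 - pct_nhs            # complement of the NHS share
    return both, never_nhs


for specialty, shares in JOURNAL_SHARES.items():
    both, never_nhs = journal_overlap(**shares)
    print(f"{specialty}: {both}% both, {never_nhs}% never used for NHS research")
```

For psychiatry the rounded inputs give 43 per cent for journals containing both kinds of paper, whereas Table I prints 42; the difference presumably reflects rounding of the underlying journal counts.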


                                  Papers               Journals
Specialty     Period          Np      % NHS     Nj     % NHS   % Non-NHS   % both
Psychiatry    1990-1999     12,301      61     1,118     67       76         42
Surgery       1990-1999      9,817      92       413     93       35         28
Paediatrics   1997-2001      8,594      73       864     76       66         42


Notes: Details from research outputs database and the publishing journals in three medical specialties








Figure 2 shows there is a noticeably higher average number of NHS papers per journal
than non-NHS papers for each of the three specialties. Considerably fewer journals are
used for the publication of surgery research than for psychiatry and paediatrics
research, hence the high number of papers per journal for NHS surgical research.
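The averages plotted in Figure 2 can be approximated from the Table I figures, as in the sketch below. This is a reconstruction under stated assumptions: paper and journal counts are derived from the rounded percentages in Table I, which may not be exactly how the authors computed the figure.

```python
# (total papers, % NHS papers, total journals, % journals with NHS papers,
#  % journals with non-NHS papers), as reported in Table I.
TABLE_I = {
    "Psychiatry": (12301, 61, 1118, 67, 76),
    "Surgery": (9817, 92, 413, 93, 35),
    "Paediatrics": (8594, 73, 864, 76, 66),
}


def papers_per_journal(n_papers, pct_nhs_papers, n_journals,
                       pct_nhs_journals, pct_non_nhs_journals):
    """Return (NHS papers per NHS journal, non-NHS papers per non-NHS journal)."""
    nhs_papers = n_papers * pct_nhs_papers / 100
    non_nhs_papers = n_papers - nhs_papers
    nhs_journals = n_journals * pct_nhs_journals / 100
    non_nhs_journals = n_journals * pct_non_nhs_journals / 100
    return nhs_papers / nhs_journals, non_nhs_papers / non_nhs_journals


averages = {name: papers_per_journal(*row) for name, row in TABLE_I.items()}
for name, (nhs_avg, non_nhs_avg) in averages.items():
    print(f"{name}: {nhs_avg:.1f} NHS vs {non_nhs_avg:.1f} non-NHS papers per journal")
```

Under these assumptions the NHS average exceeds the non-NHS average in every specialty, and NHS surgery has the highest value because its many papers are spread over comparatively few journals.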


Identifying key journals and their JIFs 

If we consider the journals used for publication we find that one journal in each 
specialty is very prominent and this is the same journal for both the NHS and non-NHS 
data-sets (see Tables II, III and IV). For all three specialties this top journal is a general 
specialty journal and therefore covers all, or many, aspects of that specialty, e.g.
psychiatry, but does not focus exclusively on any of the sub-specialties, e.g. “old age
psychiatry”.

Despite topping both lists, the journal most used for publication within each 
specialty provides a noticeably higher proportion of the NHS data-set than of the 
non-NHS (Figures 3, 4 and 5). Almost inevitably those journals towards the top of each 
list contain both NHS and non-NHS papers, but these figures also show that for each 
specialty there are some journals that contain a significantly higher proportion of NHS 
papers and others a significantly higher proportion of non-NHS papers.

The British Journal of Psychiatry publishes the most NHS and non-NHS psychiatry 
papers, with other general psychiatry journals, general medical journals and 
sub-specialty journals also of importance (Table II). Although The British Journal of 
Psychiatry also has the highest journal impact factor for psychiatry-specific journals 
out of those listed, the relationships between publication figures and impact factors 
vary considerably. For example, International Journal of Geriatric Psychiatry, a 
sub-specialty journal with a low impact factor relative to the other journals listed, was
the second most important journal for publication in the NHS data-set. 
Figure 2. Average number of papers per journal for English NHS-funded and non-NHS research in three medical specialties


Table II. Journals with the most NHS-funded and non-NHS English psychiatry papers: top ten in each data-set, ranked by number of NHS-funded papers

                                                          NHS-funded    Non-NHS funded
Journal                                                     n      %      n      %   JIF 2001
Total (all journals)                                    7,497  100.0  4,804  100.0
The British Journal of Psychiatry                       1,038   13.8    294    6.1    4.1
International Journal of Geriatric Psychiatry             337    4.5     75    1.6    1.8
Psychological Medicine                                    286    3.8    192    4.0    3.1
British Medical Journal                                   216    2.9     51    1.1    6.6
Journal of Psychosomatic Research                         173    2.3     57    1.2    1.7
Acta Psychiatrica Scandinavica                            133    1.8     54    1.1    2.1
International Clinical Psychopharmacology                 120    1.6     66    1.4    2.3
Journal of Affective Disorders                            112    1.5     39    0.8    1.9
Journal of Neurology Neurosurgery and Psychiatry          103    1.4      7    0.1    3.0
British Journal of Medical Psychology                      98    1.3     98    2.0    0.7
Journal of Advanced Nursing                                91    1.2     70    1.5    0.8
Social Psychiatry and Psychiatric Epidemiology             78    1.0     67    1.4    1.4
Journal of Child Psychology and Psychiatry and Allied
  Disciplines                                              65    0.9     76    1.6    2.7
Psychopharmacology                                         64    0.9    105    2.2    3.1
Personality and Individual Differences                     24


Table III. Journals with the most NHS-funded and non-NHS English surgery papers: top ten in each data-set, ranked by number of NHS-funded papers

                                                          NHS-funded    Non-NHS funded
Journal                                                     n      %      n      %   JIF 2001
Total (all journals)                                    9,023  100.0    794  100.0
British Journal of Surgery                              1,498   16.6     90   11.3    3.8
Annals of the Royal College of Surgeons of England        836    9.3     60    7.6    0.7
The Journal of Bone and Joint Surgery-British Volume      681    7.5     33    4.2    1.5
Clinical Otolaryngology                                   463    5.1     47    5.9    0.7
British Journal of Oral & Maxillofacial Surgery           398    4.4     30    3.8    0.6
British Journal of Plastic Surgery                        374    4.1     17    2.1    0.8
Annals of Thoracic Surgery                                373    4.1     38    4.8    2.0
Journal of Pediatric Surgery                              239    2.6     14    1.8
British Journal of Urology                                234    2.6      4    0.5    1.6
Journal of Urology                                        175    1.9     32    4.0    3.3
Journal of Thoracic and Cardiovascular Surgery            170    1.9     19    2.4    3.3
Journal of Orthopaedic Research                            27    0.3     21    2.6    2.2


The leading journal for publication in surgery in both NHS and non-NHS data-sets was
British Journal of Surgery. Other general surgery journals and sub-specialty journals
were also important in terms of publication figures though general medical journals
were not. British Journal of Surgery had the highest journal impact factor for
surgery-specific journals, but again the relationship between publication figures
and impact factors varied considerably (Table III).

The journals used for the publication of paediatrics research were more varied.
There were some general paediatrics journals, some general medical journals and some
sub-specialty journals both from within paediatrics and from other disciplines. The
leading journal for publication in paediatrics, Archives of Disease in Childhood, does
not have the highest journal impact factor for paediatrics-specific journals and the
relationship between publication figures and impact factors varies considerably. Some
journals from other disciplines publish paediatric research and also have higher
journal impact factors than Archives of Disease in Childhood (Table IV). Archives of
Disease in Childhood had a clear lead in publication numbers over any other journal
both within the NHS and non-NHS data-sets. Journal of Pediatric Surgery was third in
publication order in the NHS data-set but at the bottom of those listed for the non-NHS
data-set. This journal also occurs in the surgery data-set (Table III), though over a
different time period, and the two sets of figures support each other, indicating the high
level of involvement of the NHS in surgery research.


Table IV. Journals with the most NHS-funded and non-NHS English paediatrics papers: top ten in each data-set, ranked by number of NHS-funded papers

                                                          NHS-funded    Non-NHS funded
Journal                                                     n      %      n      %   JIF 2003
Total (all journals)                                    6,247  100.0  2,347  100.0
Archives of Disease in Childhood                          847   13.6    141    6.0    1.7
British Medical Journal                                   167    2.7     52    2.2    7.2
Journal of Pediatric Surgery                              144    2.3      6    0.3    1.4
The Lancet                                                136    2.2     54    2.3   18.3
Developmental Medicine and Child Neurology                129    2.1     45    1.9    1.9
Acta Paediatrica                                          128    2.0     40    1.7    1.1
British Journal of Obstetrics and Gynaecology             120    1.9     17    0.7    2.0
European Journal of Pediatrics                            107    1.7     18    0.8    1.2
Prenatal Diagnosis                                         74    1.2     28    1.2    1.5
British Journal of Haematology                             74    1.2     12    0.5    3.3
Pediatric Research                                         72    1.2     33    1.4    3.1
The British Journal of Psychiatry                          40    0.6     27    1.2    4.4
American Journal of Clinical Nutrition                     12    0.2     26    1.1    5.7
European Journal of Clinical Nutrition                      9    0.1     28    1.2    1.9


Figure 3. Comparison of the percentages of NHS-funded and non-NHS UK psychiatry papers in selected journals (x-axis: percentage of NHS-funded psychiatry papers; y-axis: percentage of non-NHS funded psychiatry papers)

Figure 4. Comparison of the percentages of NHS-funded and non-NHS UK surgery papers in selected journals

Figure 5. Comparison of the percentages of NHS-funded and non-NHS UK paediatrics papers in selected journals


Journal importance to clinicians

Within each specialty, the leading general specialist journal not only contains the
greatest number of NHS publications but is also considered important by clinicians
(see Table V).

In the specialties of psychiatry and paediatrics these prominent journals are
perceived as important to their clinical practice by more than 80 per cent of the
respondent clinicians within that specialty. These two journals are membership
journals and therefore will be received routinely by most questionnaire respondents
within the respective specialties. In the specialty of surgery the most prominent journal
is ranked first, second or third most important by 40 per cent of surgeons. It is not a
membership journal and concentrates on particular sub-specialties. It has “a tradition
of publishing papers in breast, upper GI (gastro-intestinal), lower GI, vascular,
endocrine and surgical sciences” in addition to general surgery (British Journal of
Surgery, 2004). For the specialty of surgery the membership journal, Annals of the
Royal College of Surgeons of England, is in second place in terms of NHS publications
and is ranked first, second or third most important to clinical practice by 24 per cent of
surgeons. The journal with the third highest number of NHS publications is a
sub-specialty journal, The Journal of Bone & Joint Surgery - British Volume, and is
ranked first, second or third in importance by only 6 per cent of surgeons overall.

The leading specialist journal for paediatrics is Archives of Disease in Childhood.
The general medical journal, BMJ, has the next highest number of NHS publications
and its perceived importance to clinical practice is similarly high with more than 50 per
cent of clinicians ranking it first, second or third. As mentioned above, the NHS
data-set includes Journal of Pediatric Surgery as the third journal for publication but,
although used for publication of many paediatrics papers, only 1 per cent of
paediatricians consider it important for their clinical practice.


Table V. The leading journals in each specialty with the most NHS-funded papers, compared with the percentages of specialists ranking them first, second or third to inform their clinical practice

Journal                                                 NHS papers (%)   Ranking 1, 2 or 3 (%)
Psychiatry
  The British Journal of Psychiatry                          14                 81
  International Journal of Geriatric Psychiatry               4.5               26
  Psychological Medicine                                      3.8               29
Surgery
  British Journal of Surgery                                 17                 40
  Annals of the Royal College of Surgeons of England          9.3               24
  The Journal of Bone and Joint Surgery - British Volume      7.5                6
Paediatrics
  Archives of Disease in Childhood                           14                 86
  British Medical Journal                                     2.7               51
  Journal of Pediatric Surgery                                2.3                1


Discussion 
Limitations 
Those papers identified as having NHS funding require only some element of NHS 
funding to be so classified. This could be in the form of direct research funding from 
the NHS or that the research was carried out on NHS premises. Many papers in the 
NHS data-set were not funded by the NHS exclusively; indeed, for some papers 
contributions came from two or more other funding bodies. Furthermore, the data-sets 
used for the three specialties are not directly comparable as they covered the periods of 
1990-1999 for psychiatry and surgery and 1997-2001 for paediatrics. The analysis was 
limited to the papers and the publishing journals identified by each specialty filter. 
For the questionnaire surveys, the response rates were only 47 per cent for 
psychiatrists, 38 per cent for surgeons and 43 per cent for paediatricians. The journals 
ranked highest for readership by psychiatrists and paediatricians were membership 
journals that are routinely received by most questionnaire respondents. Finally, the 
surveys were cross sectional and therefore unable to track changes and no adjustments 
have been made for the time delay between the publication figures and the 
surveys. 

Implications and value 

The findings presented here contribute to building a fuller picture of the nature of 
journals used to publish English NHS research. They show that a very large number of 
papers are associated in some way with NHS funding but that one leading journal in 
each specialty is important as a publication outlet and it tops both the NHS and 
non-NHS lists. In all three specialties, this top journal is a general specialist journal 
rather than a general medical or sub-specialist journal. Despite the important role of the 
leading journal, the papers are spread among a very large number of journals in each 
specialty. 

There is no consistent pattern in the relationship between the journal impact factors 
and the ranking based on number of publications. For psychiatry and surgery, 
however, the top journal in both of these lists is also, of the specialty journals listed, the 
one with the highest JIF. This might be thought to increase the significance of these 
journals. For each specialty, the analysis of the importance to clinical practice of the 
three journals publishing most specialty-related articles again reveals a mixed picture. 
The journal most used for publication is also widely seen as being important to 
clinicians, but this is not so for the journal third on the list for surgery and paediatrics. 
For surgery, the journal most used for publication is important to a markedly lower 
proportion of clinicians than the equivalent journal in psychiatry and paediatrics. This 
reflects the earlier point about British Journal of Surgery covering only some, but not 
all, of the sub-specialties. 

Findings such as these have implications for research funding bodies, clinicians 
who read journals, researchers, and the journals themselves. As noted, the NHS is a 
major funder of research. This article shows that the NHS is making a contribution to a 
large proportion of the biomedical research conducted by English scientists, at least in 
the three specialties considered. This demonstrates the value of the NHS R&D to those 
with responsibility for running the NHS R&D, and those to whom they are in turn 
accountable. 

Turning to the implications for clinicians, we have been able to show which journals 
are publishing most NHS research in their field. Clinical researchers aiming to inform 
clinical practice could therefore consider targeting their findings at journals that 
publish widely in the relevant specialty or sub-specialty, and are generally perceived to 
be important by the appropriate clinicians. Many of the findings may be in line with 
the general perception on these issues, but they provide some firmer data on which to 
base analysis. This could also have implications for the journals themselves. For 
example, in relation to such findings (Jones et al., 2004) the editor of the British Journal 
of Psychiatry emphasised the responsibility of that journal to ensure advances in each 
sub-specialty were reflected in some way in the journal because it was the one 
important to most psychiatrists overall (Tyrer, 2004). Analysis such as that developed 
below could also potentially be useful for those who run journals and who are 
concerned about the increasing attention given to JIFs. A particular problem is the 
claim that the way that JIFs are calculated might mean that parts of journals, for 
example supplements that are valuable to clinicians, might have to be abandoned 
because supplements receive fewer citations and, therefore, lower the JIF (Zetterstrom, 
1999). Perhaps that could be addressed, at least partially, by redressing the balance and 
giving greater attention to clinicians’ perceptions of journals. 


A major issue of concern to researchers, and something that research funding 
bodies should consider, is the form of assessment that will be used to evaluate their 
work. Various proposals have been made to extend the evaluation of health research 
and integrate a range of features such as bibliometric analysis and assessment in 
relation to the impact of research on health policy and practice (Buxton and Hanney, 
1996; Hanney et al., 2005). Even within the area of assessment of articles describing the 
knowledge produced, there is scope for new thinking informed by the type of emerging 
data being described here that go beyond the current emphasis on JIFs. 

Whilst analysis of individual papers rather than the journals in which they are 
published could be considered ideal, it is still useful to consider how analysis at the 
journal level could be enhanced. Lewison (2003) raises the desirability of developing 
national indicators of the effectiveness of health research. The perceived importance of 
particular journals to clinical practice in different specialties and sub-specialties in the 
UK could possibly form the basis of an indicator for certain purposes. For this to 
happen, however, surveys similar to those described here would have to be undertaken 
in further specialties. The analysis might then need to consider a combination of 
national and international factors such as: 


* the importance of journals as publication outlets for national research; 

* the internationally applicable impact factors of the journals; and 

* the perceived importance of particular journals to clinical practice within the 
nation at specialty and sub-specialty level. 


Guest editorial 


Politics and government in the age of the internet 
Let’s start with some statistics and comparisons. Think back to October 1994 when the 
Labour Party put its conference proceedings on the web, leading the party to claim it 
was the first UK political party with an internet presence. Now, ten years on, the online 
political landscape has changed dramatically. In 2005 it is inconceivable that a political 
party would not have an internet site; in fact, one of Britain’s newest political parties 
(the People’s Alliance) actually launched itself online (Happold, 2003). Similarly, MPs 
with web sites were a rarity ten years ago, so much so that no official statistics exist. 
Even in 2000, Ward reported that only around 16 per cent of MPs were on the net 
(Ward, 2000). Today, the results are vastly different. Epolitix.com hosts web sites for 
MPs and also links to those who choose to host their site elsewhere; currently around 
63 per cent (416 out of 659) of Westminster MPs have a personal web site. 
The party breakdown is also interesting: 


* 97 out of 163 (59.5 per cent) Conservative MPs have a web site; 
* 258 out of 407 (63 per cent) Labour MPs; 

* 48 out of 55 (87 per cent) Liberal Democrat MPs; 

* four out of four (100 per cent) Plaid Cymru MPs; 

* three out of five (60 per cent) SNP MPs; and 


* both independent MPs (Dr Richard Taylor and George Galloway) are on the web. 
(Figures calculated by the editor based on party strengths on December 11, 2004 
and according to Epolitix web site on that day.) 
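
The percentages above can be rechecked from the raw counts. The following is a minimal sketch only: the variable and function names are illustrative assumptions, and the counts are simply those quoted in the list and in the preceding paragraph.

```python
# Counts quoted in the text: (MPs with a web site, total MPs in the party).
mps_with_sites = {
    "Conservative": (97, 163),
    "Labour": (258, 407),
    "Liberal Democrat": (48, 55),
    "Plaid Cymru": (4, 4),
    "SNP": (3, 5),
}


def share(with_site, total):
    """Return the percentage of MPs with a web site, to one decimal place."""
    return round(100 * with_site / total, 1)


# Percentage of MPs with a web site, party by party.
party_shares = {party: share(*counts) for party, counts in mps_with_sites.items()}

# Overall figure for Westminster MPs, as quoted in the text (416 out of 659).
overall_share = share(416, 659)

for party, pct in party_shares.items():
    print(f"{party}: {pct} per cent")
print(f"All Westminster MPs: {overall_share} per cent")
```

Working to one decimal place reproduces the quoted figures after rounding; the party list does not cover every party in the House, so the listed counts are not expected to sum to 416.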


Continuing in this vein, Tony Blair was the first Prime Minister to receive a petition by 
e-mail. He also appointed the UK’s first E-envoy in the E-government Unit at the 
Cabinet Office, charged with improving the delivery of public services by joining up 
electronic government services. Local councils now have e-targets and all council 
services (where appropriate) are expected to be online by the end of 2005, thus making 
£1.2 billion of efficiency savings for the Government (Arnott, 2005). Citizens can now 
log on to their local authority and submit planning applications, check their benefit 
entitlement and apply for school places. Oxford University has a professor of 
e-democracy and even political consultations are held online (for example, the Northern 
Ireland Affairs Committee Inquiry into hate crimes gathered evidence at 
www.tellparliament.net/hatecrime/). 

These trends in e-politics and e-government are not confined to the UK alone. In 
2003, 50,000 French expatriates in the US were able to vote over the internet for 
members of the Conseil Supérieur des Français de l’Étranger (CSFE; Upper Council for 
French Expatriates). In fact, commentators in the States have even attributed an election 
win to use of the internet by a candidate: Jesse Ventura’s gubernatorial run in 
Minnesota in 1998 when a third party candidate with a low budget won a presumed 
two-horse race with a huge swing. 

In June 2000 Bill Clinton became the first President to conduct an internet address 
and the first to appoint a press secretary solely to deal with the internet (Mark 
Kitchens, appointed as Director of Internet News). The Canadian government has won 
Accenture’s e-government award for the last four years: Ottawa regularly surveys its 
citizens to ask what local and federal services they’d like to see online and builds them 
into a single e-government portal. All evidence of good progress, but on the negative 
side Jessica Cutler lost her job as a junior staffer to Senator Mike DeWine for detailing 
her (sex) life on Capitol Hill in a blog that was picked up by The Washington Post. 

But this is only one side of the e-equation. Interactivity, commentators have 
stressed, is the crux of e-politics and e-democracy. There is clearly no point in parties, 
councils, governments and political players using new technology if the population at 
large remains unconnected. However, here too the signs are encouraging. According to 
the latest estimates from the Office of National Statistics: 


In the second quarter of 2004, 52 per cent of households in the UK (12.8 million) could access 
the Internet from home, compared with just 9 per cent (2.2 million) in the same quarter of 1998 
(Office of National Statistics, 2004). 


It is highest in the 16 to 24 age group (83 per cent) used Internet in last 3 months (Office of 
National Statistics, 2004). 


As this applies only to those who have internet access at home, a much greater 
percentage of the population presumably has access either at work or via public 
libraries. Nielsen-netratings.com collates data on internet usage in over 70 countries 
worldwide, and the same encouraging trends are evident. Take these examples from 
the US political arena: 


At home traffic to Democrats.org jumped to 574,000 unique visitors as 43 percent of online 
surfers flocked to a Web page titled “Take Action: Stop the Right-Wing Smears Against John 
Kerry,” which asked voters to sign a petition against the Sinclair Broadcasting Group for 
airing anti-Kerry programming. 


At work traffic to Democrats.org increased to 1.1 million unique visitors, up from 568,000 
the prior week, as 41 percent of the viewers also visited the same Web page (Dierkes, 2004). 
And, during the period of the National Conventions: 


The Republican National Convention (RNC) boosted at home traffic to the Bush-Cheney 
Website by 50 percent, as compared to a 191 percent jump in home visitors to the John Kerry 
for President Website during the week of the Democratic National Convention (DNC). 
The DNC helped Kerry’s Website climb to its highest number of weekly home visitors, or 
771,000 unique audience, during his campaign, while Bush’s Website attracted 438,000 
unique visitors during the start of the RNC (see Tables 1 & 2). The Bush-Cheney Website 
reached the highest number of home visitors, or 622,000 unique audience, during the week 
ending June 27 (Fan, 2004). 


The fact that the Office of National Statistics has only gathered this type of data 
since 1998, and that now whole companies exist to measure global internet 
audiences indicates how use of the net has grown. This obviously impinges on a 
variety of political and governmental areas, and the articles within this Special 
Issue, written during the electoral campaigns in the US and Iraq and published to 
coincide with the UK general election[1], cover some of these topics in greater detail. 
Each of the contributors was approached on the basis of their expertise in a specific 
sphere of e-politics or e-government and it was considered important to have 
contributions from abroad in order to reflect the international nature of politics and 
government on the Internet. Short resumes of our contributors and their articles 
follow. 

Adrian Cunningham has held the position of Director, Standards and Initiatives at 
the National Archives of Australia (NAA) since 1998. Adrian is also Secretary of the 
International Council on Archives (ICA) Committee on Descriptive Standards, 
Convenor of the Australian Society of Archivists Descriptive Standards Committee, 
Chair of the AGLS Metadata Working Group and a member of Standards Australia’s 
Committee IT/21, Records Management. Adrian was President of the Australian 
Society of Archivists, 1998-2000. 


Margaret Phillips is Director of Digital Archiving, National Library of Australia 
(NLA) and managed the development of the NLA’s PANDORA Archive of Australian 
Web publications. She is involved in establishing policy, procedures and infrastructure 
for ensuring long-term access to Australian Internet publications. 

Cunningham and Phillips address the key and often overlooked subject of archiving 
digital formats, an area where the National Library of Australia and National Archives 
of Australia have made great progress but other countries have yet to catch up. They 
examine e-government and the role archives and libraries have in recording and 
indexing digital formats, and question why this kind of information is so vulnerable. 

Chris Pond brings a wealth of experience to the journal. An honorary Fellow of the 
Chartered Institute of Library and Information Professionals and an associate Member 
of the Institute of Public Relations, Chris is Head of Reference and Reader Services at 
the House of Commons Library, where he has worked for 30 years. He lectures and 
writes regularly on Parliament and parliamentary history, and government publishing 
and information. He has a special interest in historical, as well as current, sources in 
this area and sits on the Standing Committee of Official Publications. He has been 
involved in electronic publishing since 1978 and has been keen to explore means of 
electronic information dissemination. As Town Mayor of Loughton, he also takes an 
interest in local government publishing and information. Chris’s article covers the rise 
of the end user and electronic information provision in a unique setting — the House of 
Commons Library. 

A Fellow of the Chartered Institute of Library and Information Professionals, 
Janet Seaton is a parliamentary reference specialist who worked at the House of 
Commons Library for over 20 years. In October 1998 she was seconded to the 
Scottish Office to set up a research and information service for the Scottish 
Parliament. Janet took up the post of Head of Research and Information Services at 
the Scottish Parliament on December 1, 2000, and is part of the Parliament’s Senior 
Management Team. She is also a member of the Study of Parliament Group. She 
has a degree in political science, and has written articles and contributed to books 
on parliament and politics. 

The Scottish Parliament has always seen the internet as one of the major 
mechanisms for engaging Scottish citizens in the Parliament’s business and activities. 
Its most successful initiatives have been the e-petitioning system, the webcasting of 
proceedings, the discussion forums and our MSP video diaries. Janet’s article describes 
these initiatives and assesses the prospects for future developments. 

Continuing in this Parliamentary vein, Caroline Auty has worked for five years in 
information services at the House of Lords Library answering reference enquiries from 
Peers on a variety of subjects. She is currently part-way through a year-long 
secondment to the Parliamentary IS-IT Change Programme, a major programme that is 
seeking to set up a unified ICT (information and communication technology) service to 
cover both Houses of Parliament. Caroline writes for CIBER at University College 
London. She has had work published on different aspects of e-politics including 
political parties’ use of the internet, the web sites of London mayoral candidates and 
political hacktivism. She has also written about football fans’ use of the internet and is 
a closet football hack at heart. Her contribution this time examines weblogs of MPs, in 
particular whether they compensate for some of the commonly made criticisms of MPs’ 
web sites. The role of weblogs in fostering genuine interactivity between elected 
representatives and their constituents is examined. 

Professor Richard Rogers examines the contribution of blogs to news on the 
internet and the relationship between news and the internet in general. Richard is 
a university lecturer in New Media at the University of Amsterdam, recurrent 
Visiting Professor in the Philosophy and Social Study of Science at the University 
of Vienna, and Director of the Govcom.org Foundation (Amsterdam). Previously, 
he worked as Senior Advisor to Infodrome, the Dutch Governmental Information 
Society initiative. He earned his PhD and MSc in Science Studies at the University 
of Amsterdam, and his BA in Government and German at Cornell University. 
Over the past five years, Rogers and the Govcom.org Foundation have received 
grants from the Dutch Government (the Ministry of Foreign Affairs and the 
Ministry of Education, Culture and Science), the Open Society Institute and the 
Ford Foundation. Rogers is author of, amongst other things, Information Politics 
on the Web (Rogers, 2004). 

Finally, Charlotte Steinmark, MA in History and Computer Science, has worked 
with the issues of electronic documentation of the public sector and its work processes 
since 1993. In 1997 she was affiliated with SLAIS for three months while working on 
her PhD thesis. The thesis investigated the legal issues of electronic documentation of 
the public sector in seven European countries including the UK, and how the 
legislation affected the cooperation between the public archives and the authorities. 
Since 2002 she has been working on the FESD project to provide a framework contract 
for the whole public sector in Denmark covering the purchase of an EDM system, 
technical and organisational consulting for implementation and organisational change. 
Charlotte has profound experience from both the private and public sectors and is 
presently working for the interest organisation Local Governments Denmark, which 
represents the local governments towards the central administration and does 
consulting work for the local governments on all issues. Charlotte describes her 
involvement with the FESD project in this case study and details the benefits of a 
mutual procurement framework for the public sector in Denmark. 


Caroline Auty 
CIBER, School of Library, Archive and Information Studies, University College London, 
London, UK 


Note 


1. As this issue went to press, William Hill quoted odds of 1/20 on May 5, 2005 to be the date of 
the next General Election. 


Abstract 


Purpose — To review the challenges associated with ensuring the capture and preservation of and 
long-term access to government records and publications in the digital age and to describe how 
libraries and archives in Australia are responding to the challenge. 
Design/methodology/approach — Literature- and case-study-based conceptual analysis of what 
makes government online information so vulnerable and initiatives at the National Library of 
Australia and the National Archives of Australia. 

Findings — Democracy, governance, consultation and participation all depend on the availability of 
authentic and reliable information. Government agencies as well as educational and research 
institutions are producing increasingly large volumes of information in digital formats only. While 
Australia has done more than most countries to date to address the need to identify, collect, store and 
preserve government publications and public records in digital formats, large amounts of information 
are still at risk of loss. 

Research limitations/implications — Focuses on circumstances and initiatives in the Australian 
Government. 

Practical implications — Librarians and archivists need to become more proactive in influencing 
the behaviour of government agencies to ensure that important evidence of democratic governance is 
created and managed in ways that facilitate their accessibility and long-term preservation. 
Originality/value - Emphasises the vital role that information management agencies such as 
libraries and archives have to play in supporting transparent and accountable governance in the 
digital age, and explores innovative strategies for ensuring the long-term preservation of this 
important documentary heritage material for the use of future generations. 

Keywords Archiving, Digital libraries, Document management, Government, Publications, 
Records management 

Paper type Case study 


There is no political power without control of the archive, if not of memory. Effective 
democratisation can always be measured by this essential criterion: the participation in and 
the access to the archive, its constitution and its interpretation (Derrida, 1996, p. 4, n. 1). 

disks, network drives, CDs, DVDs, online on the internet and intranets and in electronic 
mail. All of these formats contain information that may be important sources of 
memory and evidence for e-government and e-democracy. However, it is the 
phenomenon of the internet, and more particularly the worldwide web developed in 
1993, that has revolutionised the publication and distribution of information in digital 
form and has the greatest potential to impact, for a variety of reasons, some positive, 
some negative, on the ability of citizens to participate in democratic government in a 
digital world. 

This paper focuses on information made available online via the internet and 
intranets and the role of libraries and archives in collecting, managing and preserving 
it for long-term access. “Electronic”, “digital”, “online” and “born digital” are all terms 
used in the literature to refer to information made available in this manner. In this 
paper the term “online” is used to refer to information made available via the internet 
(including the web) and intranets. “Digital” is used to refer more generally to the 
broader range of formats. 


Definition of an online publication 


A publication is information, regardless of its format or method of delivery, that is 
made available to the general public, or to an identified public, either free of charge or 
for a fee. In theory this includes everything publicly available via the internet. In 
practice, however, the National Library of Australia and its partners who contribute to 
PANDORA: Australia’s Web Archive (National Library of Australia, 1996) selectively 
collect only certain types of online publications, usually only those without print 
equivalents. These include journals and other serials, research papers, conference 
proceedings, and web sites or parts of web sites, which provide substantial or unique 
information about a topic, organization or person[1]. 


Definition of an online record 


A record is defined as “information created, received and maintained as evidence and 
information by an organisation or person, in pursuance of legal obligations or in the 
transaction of business” (International Standards Organization, 2001, p. 3). In the 
online world, this includes e-mail sent within an agency and between an agency and 
other organisations and individuals. It also includes web sites on intranets and the 
internet. Publications are also records. Those Australian government publications that 
are not archived by the National Library of Australia should be transferred to the 
National Archives of Australia by the creating agency. 


These definitions reveal the overlapping boundaries between publications and 
records in the online world, and indicate the overlapping responsibilities and need for 
close cooperation between the National Library, National Archives and the creating 
agency if government information is to be kept accessible now and in the future[2]. 
While computer scientists in Australia had access to e-mail and FTP from the late 
1970s (Clarke, 2004, section 4.2, paragraph 3), and staff of the CSIRO and academics 
within Australian universities mostly had access to the internet via AARNET by May 
1990 (Clarke, 2004, section 4.4, paragraph 2), most Australians had to wait until the 
mid-1990s for access following the explosive impact of the worldwide web. By this 
time, when the National Library and the National Archives began to consider their 
responsibilities in relation to collecting and preserving online information, the 
enormity of the task and many of the risks involved were already apparent. 


E-government and the role of libraries and archives 
In April 2000 the Australian Government issued “GovernmentOnline: The 
Commonwealth Government’s Strategy”, which stipulated that: 


All new non-commercial publications released by a Minister or agency must be made 
available online concurrently with other forms of dissemination from 1 June 2000 (Australia, 
Department of Communications, Information Technology and the Arts, 2000). 


While many Australian government publications continued to be published in print, as 
well as online, the volume of online government publishing expanded rapidly from this 
time onwards. An increasing number of titles are produced only online, and this 
situation is being replicated in all States and Territories. 

It is the role of the National, State and Territory libraries (the deposit libraries) to 
document the published heritage of Australia. By collecting, cataloguing to the 
National Bibliographic Database (NBD)[3] and preserving publications, these libraries, 
as well as other research libraries, such as the university libraries, enable researchers 
to find out what information has been published and to gain access to it. Unless these 
libraries are informed about what has been published online and have sufficient 
resources to deal with it, a substantial body of Australia’s published heritage, 
including vital government publications, will be difficult to identify and locate now, 
and will be lost altogether to future generations. In fact, the publications of some 
Australian government agencies are already difficult to locate, being moved willy-nilly 
from one site to another, as agencies change name and functions are shifted from one 
department to another. 

In addition to being an important source of national and cultural memory and places 
of scholarly research, archives in democracies are meant to help protect the rights and 
entitlements of the governed. Archives provide a means of democratic accountability, a 
means of empowering citizens against potential maladministration, corruption and 
autocracy. In the words of John Fleckner (1991, p. 13), archives are bastions of a just 
society where “individual rights are not time bound and past injustices are reversible”, 
where “the archival record serves all citizens as a check against a tyrannical 
government”. For these noble objectives to be achievable, evidence of government 
decisions and activities, including government online activities, needs to be captured 
and preserved in an accessible form for as long as it is needed, in some cases 
indefinitely. 

In the interests of accountable government and for the benefit of the community the 
National Archives of Australia promotes reliable recordkeeping and maintains a 
visible, accessible and known collection. To this end, the Archives enables and 
promotes best practice in the management of government records in all formats from 
the point of creation for as long as they are required to support the needs of 
government and the people. The importance of National Archives standards and 
guidelines as enablers of successful e-government has been recognised by the 
Australian Government, which has made compliance with relevant Archives policies 
and standards mandatory under the Government Online Strategy and its successor, the 
e-Government Strategy. 

Thomas Jefferson is widely quoted as having said “Information is the currency of 
democracy”[4]. To be able to evaluate the success or otherwise of government 
programs, to be able to assert their rights, to debate the issues of the day, citizens must 
have access to current and historic sources of information generated by government 
and by other sectors such as academia and private enterprise. In an online world, this 
means that access to information created and distributed by digital means is essential. 
Yet online information is known to be ephemeral. It has been estimated that web pages 
have an average life of 44 days (Library of Congress, 2003) although, in the experience 
of the National Library of Australia, high quality publications of research value 
usually remain available on the publisher’s web site for much longer. 


What makes online information so vulnerable? 
There are technical, organisational, legal, financial, and political reasons for the 
vulnerability of online information. 


Technical factors 
For libraries and archives with the obligation to keep information indefinitely, the 
preservation of digital information poses a significant challenge, and to date there are 
no proven reliable long-term methods. The hardware and software that the creators of 
electronic publications and organisational records use to produce and store their digital 
works change over time, in most cases quite rapidly. 

Dack, a consultant engaged by the National Library to undertake a risk assessment 
of its digital collections, explained the problem this way: 


Digital information is not inherently human readable; it is dependent on machines for its 
creation and use. Made up of patterns of binary signals recorded on various media, it requires 
software, and sometimes specialised hardware, to interpret the logical structure of these 
physical patterns that record the data in order to deliver the conceptual object in a form that is 
understandable to the user (Dack, 2004, p. 1). 


Dack points out that while it is relatively easy to preserve the physical representation 
of the data, the “bit stream”, this does not guarantee the long-term preservation of the 
conceptual information or the content of the digital objects. This requires retaining the 
ability to interpret the physical and logical structure of the object by means of software 
and an appropriate hardware environment: 


It is the complexity, ongoing nature and cost of the task of preserving the ability to interpret 
and render the information indefinitely into the future that is the biggest risk to the digital 
collections (Dack, 2004, p. 1). 


Information that is available to us today will not be available to us in perhaps even ten 
years’ time, unless we have strategies in place to actively collect, describe and preserve 
it. These strategies are labour-intensive, time-consuming and costly, and need to be 
applied from the moment of creation of the materials, and then continued indefinitely. 
The need for long-term access to important material, such as government publications 
and records, should be actively taken into account at, or even before, the time of their 
creation to minimise the complexity and cost of preservation. 

As Cordeiro (2004) points out, it is not just the transient nature of digital formats 
but the scale of the preservation problem as well, given the diversity of physical and 
logical information representations and in terms of the volume of the information. 


Organisational factors 

Producing and collecting online information requires entirely new infrastructure, 
policies, procedures and staff skills. In the period of transition from print to online, in 
which we will remain for the foreseeable future, organisations have been burdened 
with the need to support both print and online. 

Retention of online information for future access depends on complex human and 
technical systems to manage the creation and storage of organisational records, the 
contents of organisational web sites, and other online publications. It requires 
organisational commitment from the top down, and staff with the knowledge and skills 
to apply them and the will to comply with them. 


Some publishers, including government agencies, actively resist the retention of 
historical information, arguing that it can be misleading. While it is certainly necessary 
to manage out of date information so as not to mislead, there are effective ways of 
doing this and there is no valid argument for denying access to this information for 
future research purposes. 

Organisations that own information are impermanent. They restructure their web 
sites, have name changes and, in some cases, are abolished, split up or are 
amalgamated with other organisations. An organisation may cease a particular 
function or transfer it elsewhere. In universities and research centres, funding for a 
research program comes to an end, researchers relocate or retire and the results of the 
research are left in electronic limbo (Nicholls, 2004). In each of these situations, there is 
the risk of loss of access to organisational records and published information, unless 
transfer to a responsible archives or library is negotiated and implemented. 

The trend towards devolution of responsibility for publishing within government 
departments, as well as the outsourcing of both the creation and publication of 
government information, has made it much more difficult for the agencies themselves, 
for libraries and members of the public to find out what is being published. Although 
this trend began before online publishing became such a potent force, online publishing 
has exacerbated the difficulty of identifying, describing, collecting and preserving the 
published output of individual agencies and government as a whole (LeFurgy, 2001). 


Legal factors 

In a paper delivered to the Archiving Web Resources Conference (November 9-11, 2004 
in Canberra, Australia), Malcolm Gillies addressed the question of why archiving 
information published on the web is essential and, in particular, why archiving 
government information is important: 


“Why?” must also take on a more legalistic answer in this age of e-government. With a 
responsible government it is not just the matter of what librarians or archivists might like 
digitally to collect. Many government digital presentations have to be preserved, in an 
authentic form, because of their legal status or obligations of citizen access. If the Web is 
becoming the leading form of interaction between governors and governed — maybe soon we 
shall even be voting over the Web — then much more emphasis on preservation, consistency 
of collection and confronting of technological obsolescence will be needed (Gillies, 2004). 


When information is published in print, the legal deposit provisions of the Australian 
Copyright Act 1968 oblige the publisher to deposit a copy of each title with the 
National Library. Equivalent State legislation also requires deposit in the State or 
Territory library of the State or Territory of origin. 

The situation is much less regulated in the case of electronic publications. The 
Commonwealth, States and Territories have varying legal requirements for electronic 
publications in different formats. The legal deposit provisions of the Commonwealth 
Copyright Act 1968 do not apply to electronic publications at all and, of the States, only 
Tasmania and South Australia have legislation that the State libraries are confident 
encompasses online publications. In the Northern Territory, the Publications (Legal 
Deposit) Bill 2004 has been passed in the Legislative Assembly and the Act will 
commence on March 1, 2005. Under this Act the Northern Territory Library will 
assume an obligation to collect, store, preserve and provide access to all material 
lodged from Northern Territory authors, printers and publishers, including online 
publications and web sites (Northern Territory of Australia, 2004). 

Under the New South Wales Premier’s Department Memorandum 2000-15 (New 
South Wales, Premier’s Department, 2000, p. 3), “the location of new networked 
electronic publications, such as those publicly accessible on Web sites, must be notified 
to the State Library”. It is interesting to note that the New South Wales Memorandum 
emphasises the value of government information and the key role of libraries, 
especially National, State, local public and government agency libraries, in providing 
public access to government publications. Often, however, it is extremely difficult for 
libraries to find out what online government publications are available and it is not 
unusual for librarians in government departments, both State and Commonwealth, to 
lament that they do not know the totality of what their own departments are 
publishing. 


Governments of other Western nations have been more enlightened than the 
Australian Government in seeing the necessity of legislative support for the national 
library in collecting online publications and web sites. In Sweden in May 2002 the 
government issued a special decree authorising the National Library not only to collect 
Swedish web sites but also to allow the public access to them within the library 
premises (National Library of Sweden, 2002). 

In the United Kingdom a piece of “enabling legislation”, the Legal Deposit Libraries 
Act 2003 (United Kingdom, 2003) extends the principle of legal deposit to electronic 
and other non-print publications, although it is expected to be several years before the 
necessary regulations are in place to enable the deposit of online publications and web 
sites. 

The National Library of New Zealand (Te Puna Matauranga o Aotearoa) Act 2003 
came into force on May 5, 2003 and extends legal deposit to include electronic 
publications, including online publications and web sites (National Library of New 
Zealand, 2003). In addition, in 2004 the New Zealand Government granted the National 
Library NZ$24 million to ward off “digital amnesia”, and enable it to implement plans 
for a trusted digital repository (National Library of New Zealand, 2004). 

From a record-keeping point of view the following pieces of Australian legislation 
are all relevant to government online activity: 


* Archives Act; 
* Privacy Act[5]; 
* Electronic Transactions Act[6]; 
* Evidence Act; 
* Freedom of Information Act; 
* Public Service Act; and 
* Financial Management and Accountability Act. 


Together, this legislation promotes accountability, protects privacy and confidentiality 
and allows the Australian community access to our documentary heritage. 

The Archives Act 1983 provides for public access to Australian Government 
records over 30 years old regardless of their format while protecting sensitive personal 
information, and indeed other categories of sensitive information. The Act balances the 
public interest in the workings of government with the need to protect sensitive 
information. 

The Freedom of Information Act 1982 permits access to Australian Government 
records less than 30 years old in certain circumstances while protecting sensitive 
information. It also gives the public the right to request the correction of personal 
information about themselves in government records if it is incorrect, incomplete, out 
of date or misleading. 

With the exception of disposal authorisation powers given to the National Archives 
under Section 24 of the Archives Act, the National Archives has no power to require 
Australian Government agencies to comply with the best practice record-keeping 
standards, policies and guidelines that have been issued as a part of the 
“e-permanence” campaign. While the mandating of some of our standards and 
guidelines under the Government Online Strategy has helped, compliance with these 
requirements has been uneven, with monitoring based largely on agency 
self-assessment and reporting. More recently, however, the need for agencies to 
adopt best practices recommended by the Archives has been reinforced by the 
Australian National Audit Office, which has conducted a variety of audits of agency 
record-keeping and online activity (Australian National Audit Office, 2002, 2003). 

The legal situation in the Australian Government contrasts with that of a number of 
other jurisdictions in Australia, which have recently passed archives and records 
legislation that requires agencies to comply with record-keeping standards and 
guidelines issued by the archival authority. Examples here include South Australia, 
New South Wales, Queensland, Western Australia and the Australian Capital 
Territory. In part, these legislative developments in other jurisdictions reflect the need 
for legislative change to ensure that governments manage efficiently, accountably and 
reliably the new realities and imperatives of digital information creation, transmission 
and storage. 


Financial factors 

Shelby Sanett (2002) canvasses the various elements that contribute to the cost of 
building and maintaining a digital archive. These include data selection and 
evaluation, data capture, data storage and management, resource description and 
discovery, data use, data preservation, and rights management. Both direct and 
indirect costs must be taken into account. 

“While the costs associated with ensuring long-term access to digital information 
are difficult to predict quantitatively, they are generally agreed to be significant” 
(National Library of Australia, 2004). And there is general agreement that preserving 
complex digital objects over time will be more expensive and more intensive than 
preservation of traditional library materials, for instance (Russell and Weinberger, 
2000). As long as paper-based materials are stored in appropriate conditions of 
temperature and humidity, and are kept free of pests, in most circumstances they will 
last for hundreds of years and remain available for access. This is not the case with 
digital materials, which must be actively managed and preserved through complex and 
resource intensive processes over a long period of time. 

Implementing new systems for the creation, management and preservation of online 
records and publications is costly. The National, State and Territory libraries have 
been granted no additional funding to undertake collection and preservation of online 
publications and the resources available are insufficient. As a result important 
publications, including government publications, are not being preserved for the 
future. 

Similarly, while responsibility for making and keeping full and accurate records of 
government online activity is shared between the Archives and the creating agencies, 
neither has been provided with additional funds to meet these new responsibilities. 

Both the National Archives and the National Library have in recent years created 
entirely new branches and initiatives devoted to developing policy and practice for the 
collection, retention and preservation of online publications and records. The resources 
to support all of these new activities have had to be taken from other more traditional 
areas of the Archives and Library operations. 


Political factors 

Governments change, policies change. Information previously available can embarrass 
a government, a party or an individual. Online information can easily be changed, 
accidentally, intentionally, even maliciously. The National Library has learned to 
gather the web sites of governments and political parties very quickly after a general 
election. Especially where there has been a defeat, change in content will be swift. 


The current situation in Australia: initiatives at the National Library and 
National Archives 

Although there are still many areas of concern, Australia has done more than most 
countries to date to address the need to identify, collect, store and preserve online 
publications and organisational records. Both the National Library of Australia and the 
National Archives of Australia have earned international reputations for their 
initiatives in this area. 


National Library of Australia 
The primary role of the National Library of Australia is to acquire and maintain 
comprehensive collections of documentary material relating to Australia, to preserve 
them, and to make them available for research now and in the future. In the mid-1990s 
it became apparent that this role must include the collecting of online publications too. 
As a result the Library established PANDORA (Preserving and Accessing Networked 
DOcumentary Resources of Australia): Australia’s Web Archive (National Library of 
Australia, 1996) with the aim of collecting and preserving nationally significant online 
publications, and the process of developing policy, procedures and a technical 
infrastructure began. 

Nine years on, PANDORA is now a selective archive of over 7,000 titles, many of 
which are re-gathered on a regular basis to update content. Its partners, the National 
Library of Australia, all of the mainland State libraries, the Northern Territory Library, 
ScreenSound Australia, the Australian War Memorial, and the Australian Institute of 
Aboriginal and Torres Strait Islander Studies, all identify, select, describe and archive 
online publications and web sites that fall into their own area of collecting 
responsibility and add them to the Archive. In the case of the National and State 
libraries, a primary area of collecting responsibility is their jurisdictions’ government 
publications. 

The National Library selects titles for archiving on the basis of selection guidelines 
(National Library of Australia, 2003), which give priority to six categories of 
publications: 


(1) Commonwealth and Australian Capital Territory government publications; 
(2) publications of tertiary education institutions; 
(3) conference proceedings; 
(4) e-journals; 
(5) titles referred by indexing and abstracting agencies; and 
(6) topical sites: 
* sites in nominated subject areas that will be collected on a rolling three-year 
basis; and 
* sites documenting key issues of current social or political interest, such as 
election sites, the Sydney Olympics, and the Bali bombing. 


In building the Archive the National Library and its partners are aiming to collect and 
preserve all Australian online publications that have individual research value, as well 
as a broad sample of publications and web sites that collectively provide information 
about Australia and Australians and how they use the internet. 

Incomplete coverage in the Archive of publications from all levels of government 
remains a grave concern. Government agencies in Australia are publishing 
information online at a far greater rate than the National and State libraries can 
archive it, using the labour-intensive selective methods currently at our disposal for 
description and collection. To improve coverage, more automated methods are 
essential. 

As a result, in 2003 the National Library commenced the first stage of the 
Commonwealth Metadata Pilot Project, now called the Australian Government 
Metadata Project. There are two aspects to this project: 


(1) finding out what has been published and getting records (metadata) for it into 
the National Bibliographic Database (NBD); 


(2) extracting information about online government publications from the NBD 
and batch loading it into the PANDORA Digital Archiving System, to automate 
or semi-automate the harvesting of government publications. 
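
The two aspects of the project can be pictured as a small pipeline. The sketch below is illustrative only: it assumes agency metadata arrives as simple dictionaries with Dublin Core-style keys (`title`, `creator`, `identifier`), and the names `CatalogueRecord`, `to_catalogue_records` and `build_harvest_batch` are hypothetical. It does not describe the actual interfaces of the NBD or of the PANDORA Digital Archiving System.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class CatalogueRecord:
    """Hypothetical, simplified stand-in for a bibliographic record."""
    title: str
    agency: str
    url: str


def to_catalogue_records(agency_metadata: Iterable[dict]) -> List[CatalogueRecord]:
    """Aspect 1: turn agency-supplied metadata into catalogue records.

    Items with no title or no usable web address are skipped, since they
    can be neither described nor harvested.
    """
    records = []
    for item in agency_metadata:
        title = (item.get("title") or "").strip()
        url = (item.get("identifier") or "").strip()
        if not title or not url.startswith(("http://", "https://")):
            continue
        agency = (item.get("creator") or "").strip()
        records.append(CatalogueRecord(title=title, agency=agency, url=url))
    return records


def build_harvest_batch(records: Iterable[CatalogueRecord]) -> List[Dict[str, str]]:
    """Aspect 2: derive a de-duplicated batch of harvest jobs from the records."""
    seen = set()
    batch = []
    for record in records:
        if record.url in seen:
            continue  # the same publication may be reported more than once
        seen.add(record.url)
        batch.append({"url": record.url, "title": record.title, "agency": record.agency})
    return batch
```

In such an arrangement the catalogue records would serve resource discovery, while the batch would drive automated or semi-automated gathering; the real systems involve far richer metadata and workflow than this sketch suggests.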


In 2003, seven Australian Government agencies of various sizes commenced the project 
to look at the metadata that is being created to describe their publications and to 
ascertain how to collect this metadata and convert it into a form compatible with the 
National Bibliographic Database. In 2004, another seven agencies joined the project, 
while one of the original agencies dropped out. 

There is no standard approach by government agencies for creating, describing and 
organising their publications, and each situation is different. The Library cannot work 
with hundreds of individual approaches, but it can work with a dozen or so. The aim of 
this project is to define a small number of models that the Library can support to 
receive metadata about government publications and then to promulgate these models 
to government agencies. Not only will this assist libraries and the public but it will also 
assist agencies to meet their responsibilities for keeping government information 
accessible to the community. This project will continue in 2005, and the first State 
government agencies are likely to participate. 

There is great value to the Australian community in having as wide a coverage of 
Australian publications as possible in the National Bibliographic Database. Catalogue 
records can be re-used by other libraries, such as university libraries, and they can be 
used for other resource discovery purposes, such as the federated discovery services 
now being developed. 

The second aspect of this Australian Government Metadata Project, automating 
the tasks associated with gathering of online government publications, remains a 
significant challenge and depends upon the development of the PANDORA Digital 
Archiving System to enable it to batch load and process data. This work commenced 
in 2004 but was interrupted by the need to re-engineer the system to improve its 
stability. Its development to accommodate batch loading and processing will resume 
in 2005. 

The availability of government information is crucial to the democratic process, 
but so is information from other sources, for instance, research conducted into areas 
of health, the environment, and education, within our universities. As with 
government agencies, within universities there has been little coordination of online 
publishing and usually no awareness by the university as a whole or its library of 
what is being published. While government and university libraries alike have 
traditionally kept archives of print publications, this practice has not yet evolved in 
most cases for online publications. This is starting to change. Seven Australian 
universities have now set up e-print archives into which academics can deposit copies 
of their research papers. 

In 2003 the Department of Education, Science and Training issued a call for funding 
bids to improve research information infrastructure in Australia. Four projects are to 
be supported and will be guided by the Australian Research Information Infrastructure 
Committee (ARIIC). The National Library is a formal partner in three of these projects. 
Project ARROW (Australian Research Repositories Online to the World) is being led by 
Monash University, with objectives to identify and test a software solution for digital 
repositories of research information, and to develop a national resource discovery 
service (Arrow Project, 2004). 

The Australian Partnership for Sustainable Repositories project is being led by the 
Australian National University and is developing demonstrator repositories and 
supporting the continuity and sustainability of digital collections, including research datasets 
(Australian National University Scholarly Technology Services, 2004). 

The Meta Access Management System, led by Macquarie University, is developing 
common services for authentication, authorisation and digital rights management, 
supporting access to the national network of institutional repositories (Macquarie 
University, 2004). 


National Archives of Australia 

The National Archives of Australia is committed to promoting full and accurate digital 
recordkeeping by Commonwealth agencies. The Archives provides appropriate advice 
to agencies and the Archives Act ensures that records in the Archives’ care are 
managed securely and effectively. 

The Archives has endorsed the International Standard for Records 
Management, ISO 15489-2001, and before that its predecessor AS 4390-1996, and 
promotes the adoption of this Standard in the Australian Government through the 
“e-permanence” suite of best practice recordkeeping standards, manuals and 
guidelines (National Archives of Australia, n.d.). The e-permanence web site was 
released in March 2000. 

The Australian community needs to be confident that these records, while in the 
custody of the agency that collected or created them or with the Archives if they are 
assessed as being of enduring value, will be secure, retain their integrity and be 
accessible for as long as they are required. Much of the effort of the National Archives 
in promoting its e-permanence standards and strategies to agencies is directed towards 
helping agencies to design and implement recordkeeping systems that ensure the 
making and keeping of records with these characteristics. While this regime of best 
practice standards and guidelines is meant to apply to all records, regardless of 
whether those records are made and kept in paper or in electronic form, the growth of 
electronic recordkeeping has been a particular concern of the Archives and is the 
subject of a number of specific standards and guidelines issued by the Archives in 
recent years. 

The foundation of the e-permanence suite of products is the DIRKS Manual, 
“DIRKS: A Strategic Approach to Managing Business Information” (see 
www.naa.gov.au/recordkeeping/dirks/summary). This manual outlines a detailed eight-step process 
model for the design and implementation of recordkeeping systems consistent with 
Clause 8.4 of AS ISO 15489. 

The DIRKS Manual also incorporates procedures for identifying the need for 
and appropriate retention periods for public records. Section 24 of the Archives 
Act provides that, in most cases, public records cannot be disposed of or 
destroyed without authorisation by the National Archives. Such authorisation is 
given by way of records disposal authorities issued by the National Archives 
under the terms of its legislation. 

Another critical standard available on the e-permanence web site is the 
“Recordkeeping Metadata Standard for Commonwealth Agencies” (see 
www.naa.gov.au/recordkeeping/control/rkms/summary.htm). This standard outlines the 
metadata that record-keeping systems should capture and retain in order to ensure 
the authenticity, reliability, integrity and usability of records. It includes, for example, 
metadata elements for controlling, documenting and managing access to and use of 
records. 

In April 2000 the Government Online Strategy, led by the National Office for the 
Information Economy, mandated some e-permanence standards, policies and 
guidelines as essential enablers of the Strategy. These included: 


* the AGLS Metadata Standard for online resource description and discovery (later 
published by Standards Australia as AS 5044-2002; see 
www.naa.gov.au/recordkeeping/gov_online/agls/summary.html); and 
* “Archiving Web Resources: Policy and Guidelines for Keeping Records of 
Web-Based Activity in the Commonwealth” (see 
http://naa.gov.au/recordkeeping/er/web_records/intro.html) (National Archives of Australia, 2001). 


In November 2002, the e-permanence group of standards was endorsed in the 
Government’s E-Government Strategy “Better Services, Better Government”, which 
updated and replaced the Government Online Strategy. In the same year, the 
e-Government unit of the UK Cabinet Office released its own guidelines for archiving 
web sites (United Kingdom, Cabinet Office, e-Government Unit, 2004), which reference 
earlier guidelines issued by the UK National Archives (The National Archives, 2001) 
that are broadly comparable to those previously issued in Australia[7]. 

Since 2000 the e-permanence suite of standards and guidelines has been regularly 
supplemented, revised and updated. In 2004, for example, the Archives released 
“Digital recordkeeping: guidelines for creating, managing and preserving digital 
records” (National Archives of Australia, 2004). The guidelines provide a 
comprehensive overview of all issues associated with making and keeping digital 
records and provide numerous links to related e-permanence products where more 
specific advice can be found. 

In relation to the long-term preservation and secure storage of digital Australian 
Government records, the Archives has announced its AtoR (Archives to Researcher) 
digital preservation project. The Archives has looked at other Australian and 
international approaches to this issue. Our research shows that the best strategy for 
preserving digital records is to standardise them into a stable long-term archival file 
format. 


The National Archives approach to preserving digital records uses standardised 
XML (eXtensible Markup Language). Records such as e-mails, web pages, and 
word-processed documents created in commercial software programs are converted 
and stored in a stable long-term non-proprietary XML form (Heslop et al., 2002). This 
enables records to be read with computers now and into the distant future. Where 
government records need to be kept for a long time they can be transferred to the 
National Archives for storage in its secure digital repository. 

The National Archives has developed its own open source software, called Xena 
(XML Electronic Normalising of Archives), to convert digital records into a format that 
will be stable in the long term. Xena software is available free of charge on the 
Sourceforge web site (see www.sourceforge.net/projects/xena). 
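
The general idea of normalising a record into a stable, non-proprietary XML form can be illustrated with a short sketch. It is not Xena's actual schema or code: the element names (`normalised-record`, `metadata`, `content`) and the functions `normalise_to_xml` and `recover_content` are assumptions made for illustration. The record's bytes are wrapped in a plain-text XML envelope alongside descriptive metadata and a checksum, so that the content can later be recovered and verified without the software that created it.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def normalise_to_xml(content: bytes, original_name: str, source_format: str) -> str:
    """Wrap a record's bytes in a self-describing XML envelope (illustrative only)."""
    root = ET.Element("normalised-record")
    meta = ET.SubElement(root, "metadata")
    ET.SubElement(meta, "original-name").text = original_name
    ET.SubElement(meta, "source-format").text = source_format
    # A checksum lets a future reader confirm that the content is unaltered.
    ET.SubElement(meta, "sha256").text = hashlib.sha256(content).hexdigest()
    # Base64 keeps arbitrary bytes representable as plain text inside XML.
    body = ET.SubElement(root, "content", encoding="base64")
    body.text = base64.b64encode(content).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def recover_content(xml_text: str) -> bytes:
    """Extract the original bytes and check them against the stored checksum."""
    root = ET.fromstring(xml_text)
    content = base64.b64decode(root.findtext("content"))
    expected = root.findtext("metadata/sha256")
    if hashlib.sha256(content).hexdigest() != expected:
        raise ValueError("checksum mismatch: record content has been altered")
    return content
```

A wrapper of this kind preserves only the bit stream and its description. Converting the content itself into an open format, as described above for e-mails, web pages and word-processed documents, is the harder part of the task.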


Factors inhibiting the long-term availability of online information for 
e-governance 
Publications 
* There is still a low level of recognition among creators of information, publishers, 
government agencies and politicians of the importance of keeping online 
information accessible for future generations. 
* Absence of legal deposit legislation covering online publications at the Federal 
and State levels means that collecting and preserving them remains piecemeal 
and inefficient. 
* Lack of standardised approaches to publishing government information and 
reporting its availability complicates the task of the National and State libraries 
whose responsibility it is to collect it. 
* Lack of quality metadata, even in the Australian Government sector where use of 
AGLS (Australian Government Locator Service) metadata is mandated, inhibits 
discovering what publications and web sites are available and gaining access to 
them. 
* Lack of a coordinated national strategy for retaining important national heritage 
in online formats means that information is being lost. 
* Lack of adequate funding for collecting online publications and web sites means 
that there are already serious gaps in what has been archived so far. 
* Lack of adequate funding for preservation means that what has been collected is 
at risk of loss as technology changes. 
* We still lack knowledge and the technical wherewithal to archive some types of 
publications and web sites. Because technology will keep evolving, this is likely 
to be an ongoing problem. 


Archives 


* Absence of a strategic whole-of-government framework for managing the digital 
information resources of the Australian government. 
* Shortage of staff in Archives and government agencies with broad digital 
record-keeping skills and strategic awareness. 
* In the high-pressured, fast-moving, high information-volume world of modern 
government, good record-keeping gets squeezed out, especially if staff do not 
have the training and systems that are needed to enable good record-keeping to 
be an organic part of the workflow. 
* The focus on the short-term and the here-and-now means that insufficient 
attention is given to future needs for evidence, accountability and documentary 
heritage. 
* Software packages and today’s workplace culture tend to reinforce the view that 
government information is the personal property of individuals rather than the 
view that public records are public property created and held in trust for the 
people of Australia. 
* A perception that recordkeeping is a corporate overhead that should be 
minimised or eliminated rather than an investment in the corporate memory and 
knowledge assets of the organisation that can support better, more informed 
decision-making and that will ultimately save the organisation money and make 
it more efficient. 
* A perception that good record-keeping can be guaranteed if one buys the right 
software, without realising that even the best software will fail if it is not 
implemented properly and if record-keeping change management is not taken 
seriously. 


Conclusion 

Keeping information in electronic formats available for e-governance and e-democracy 
is a public good, just as health services, education and bridges are. In fact, increasingly 
in future, the provision of other public goods will depend on the long-term availability 
of information in electronic formats, and this must be recognised and the cost of it must 
be borne by society as a whole. 

At this time in Australia, despite the pioneering work of the National Library and 
National Archives, information that is critical to e-governance and e-democracy is in 
danger of being lost. To avoid serious gaps in the record of government business, to 
avoid loss of information often paid for by the taxpayer, there is an urgent need for 
politicians, heads of government departments and other key policy-makers, educators, 
publishers, academics and the general public to realise the magnitude and the gravity 
of the situation before us. We must find the collective will and adequate resources to 
implement policies and practices that will safeguard our online heritage and keep it 
safe for the Australian people, now and in the future. 

This is not to say that the situation in Australia is any worse than it is in any other 
country in the world. On the contrary, Australia is better positioned than most to move 
quickly, if the political will is there. But we cannot justify losing our online heritage in 
this incunabular period for online information because everyone else is doing it too. We 
have guidelines, procedures, technical infrastructure and expertise, which other 
countries are seeking access to, for the collection and preservation of both publications 
and organisational records. 

Australia is still seriously disadvantaged in lacking legal deposit provisions that 
cover electronic publications. Countries such as the United Kingdom and New Zealand 
have recently addressed this deficiency by introducing amended legislation to grant 
the national library the right to collect and preserve web publications without the need 
to seek the permission of individual publishers. Such legislation must, of course, 
contain balances to safeguard the rights of publishers, as the current legislation for 
print publications does. It is essential that the principle of preserving one copy of every 
Australian publication for access by current and future generations be extended to 
electronic publications. 

A national strategy for the collection and preservation of online publications is 
required, encompassing government, education, commercial and private sectors. This 
strategy must assign responsibilities, where they have not already been assumed, for 
collecting and preserving the records and publications of the various sectors. It must be 
supported by appropriate legislation and funding. 


Notes 
1. For a full list of types of online publications collected, see National Library of Australia 
(2003, Section 3.4). 
2. These responsibilities are defined in National Library of Australia, National Archives of 
Australia, National Office of the Information Economy (2002). 


3. The National Bibliographic Database (NBD) records the holdings of approximately 1,100 
Australian libraries and is hosted by the National Library of Australia, with access through 
Kinetica: Australia’s Library Network (see www.nla.gov.au/kinetica/, January 7, 2005). 

4. While Thomas Jefferson is widely quoted as having said “Information is the currency of 
democracy”, when it comes to locating an actual citation for it, one cannot be found. It 
appears that he may never have said these exact words, although he did write things that 
mean much the same. See Coates (n.d.). 


5. In relation to Commonwealth records, the Privacy Act covers those records that are less than 
30 years of age. Privacy matters relating to Commonwealth records that are older than 30 
years are administered under Section 33 of the Archives Act. 


6. The Electronic Transactions Act 1999 has altered the way in which retention requirements 
relating to the form of certain records can be met. This does not, however, remove the 
obligation on agencies to obtain the permission of the National Archives for the disposal of 
records under the Archives Act. Sub-section 12(2) of the Electronic Transactions Act 
provides that an electronic version of a document can satisfy a requirement under a 
Commonwealth law to retain, for a particular period, a document that is in the form of paper, 
an article or other material. This is subject to a number of integrity, accessibility and 
usability requirements being met. According to the Australian Government Solicitor, the 
Electronic Transactions Act does not operate as an authorisation for an agency to destroy a 
document in the form of paper, an article or other material if it holds an electronic form of 
that document. The general prohibition on the disposal of all Commonwealth records under 
sub-section 24(1) of the Archives Act still applies, and destruction or other disposal may only 
take place in accordance with sub-section 24(2) (advice provided to the Archives by the 
Australian Government Solicitor on the subject “Relationship between Archives Act 1983 
and the Electronic Transactions Act 1999”, August 5, 2002). 


7. See also a report issued by the Smithsonian Institution Archives on archiving Smithsonian 
web sites (Smithsonian Institution Archives, 2003). 
Abstract 

Purpose – To examine the effects of technical developments on demand for traditional enquiry 
services in the House of Commons Library. 

Design/methodology/approach – Trends in enquiry load are matched against technical advances, 
especially in the area of user self-service, using published and unpublished reports. 

Findings – The growth of resources delivered via the Parliamentary intranet, and the provision of 
suitable and convenient retrieval equipment, have enabled the end-user and significantly reduced 
demands on traditional librarianship and reference skills. 

Research limitations/implications – Based on experience of one special library. 

Practical implications – Likely to be of use to information practitioners in cognate situations 
where traditional approaches are being supplanted by technical change. 

Originality/value – Case study of an organization adapting to the new realities of an 
information-rich corps of users. 

Keywords Parliament, Document management, User studies, Self-service, Electronic document delivery 

Paper type Case study 


The eclipse of the information worker and the rise of the end user have many times 
been mooted, not least of all by the present author (Pond, 2001). How is this contention 
borne out by the usage of the House of Commons Library by a well-resourced and 
demanding clientele, Members of Parliament and their staff? 


Background 

In 1995, the Library provided 89 reader places, plus 20 easy chairs for Member use. 
There were approximately 35,000 books in the main suite, not counting reserve stocks; 
395 current periodical and 90 newspaper titles were displayed, and there were 52 
stationery racks, three post-boxes, and a teletext set. There was one PC in the 
Members’ library in the Palace of Westminster set aside for Member use. Its ostensible 
function was to allow Members access to POLIS, the Library’s database, so they could 
check on what books were in stock, and it was inaugurated in November 1995. 

A library service to Members’ staff (their secretaries and researchers) was provided 
in the Derby Gate Building, about five minutes’ walk north of the Palace of 
Westminster. In 1995, there were three reading rooms, providing seats for 55 users, and 
about 5,000 reference books, plus official documentation. Three PCs for access to 
POLIS and other databases were installed in 1993, with word-processing, Outlook, and 
other MS Office functions barred when, in March 1996, the Librarian ruled “firmly 
against Library PCs being used for anything other than Library services”, meaning 
e-mail and word-processing in particular (House of Commons, 1996b). This followed 
general indications from the Information Committee that Library user PCs were not to 
be used for office duties. 

In 1992/3, 25,000 enquiries were recorded in the Members’ Library Reference 
Services Section Annual Report (Library Reference Services Section, 1993, Appendices 
B and C) and 33,000 in the Derby Gate Library (Public Information Office, 1994, 
p. 11)[1]. These may be taken as the pre-electronic norm, though they were recorded on 
a different basis in the two buildings. That year’s figures were affected by the spring 
election in 1992, but not to a marked extent. Increases (commonly of 5 or 6 per cent) 
were recorded year upon year in the late 1980s and early 1990s. The result of this was 
that the service deteriorated unless more efficient working practices were adopted, or 
extra staff were employed. In the mid-1990s, in order to preserve some sort of primacy 
for Member enquiries, and to cope with the increases, their staff had been diverted ever 
more systematically, both in person and on the telephone, to the Derby Gate Library. 


The early history of electronic provision 
Prestel and other reference information sources 
Prestel was the first external electronic information system to be made generally 
available in the Library, though there had been some abortive experiments dating back 
to the 1960s. Prestel started in 1978, and was a viewdata system designed to allow 
access to data mounted by information providers (IPs), both of a commercial and public 
service nature. Charges were a monthly rental, a call charge, and frame charges levied 
by some IPs. The Library, as an IP, had acquired two Prestel sets; they were used only 
incidentally for information retrieval. Attempts had been made by various 
organisations to popularise Prestel, notably the Nottingham Building Society, with 
its Homelink system (whereby the author paid his gas bill on Christmas Day 1982, 
much to the amusement of two of his relatives, who had been born in the 19th century), 
but the system was undermined by poor marketing, unreliable communications, and 
high charges. In France, the Teletel system, with similar technology, had 3,000,000 
users, largely because the authorities gave away terminals to replace telephone 
directories. Something which built on these concepts, but which was much more akin 
to the internet, was a service run by IBM called Prodigy in the USA — it was free at the 
point of use, save for the cost of the call, for a global subscription of $10 a month (The 
Times, 1989). Members were used to the outwardly similar teletext services, Ceefax 
and Oracle, but only a few acclimatised to Prestel. 

In 1986, a BT Tonto terminal had been installed, “but there has been little or no 
demand from Members” (Library Reference Services Section, 1986, p. 12). The Merlin 
Tonto was a customer terminal which combined, for the first time in a British Telecom 
(BT) product, telephony with the developing technology of personal computers (Tonto, 
2004). Tonto was in effect a low-power PC with communications modem inbuilt, but 
was not easy to use. A major impetus to electronic delivery in the late 1980s was the 
decision of the then Central Office of Information to adopt electronic delivery of Press 
Notices, then one of the Library’s most vital tools (Central Office of Information, 1989). 
This feed was subsequently implemented as a subsidiary POLIS database in February 
1990. 

In 1988, the Library had investigated optical disc systems for recording and 
accessing press material, sending two staff to the Parliamentary Library in Brussels, 
where that system was in use, consulting the Central Communications and 
Telecommunications Agency (CCTA) (March 15, 1988) and visiting the Daily Mirror 
Press Archive. The problem was whether we needed facsimile access, and thus WORM 
(write once, read many times) storage. In the end, it was concluded, rightly as it 
happened, that this was an intermediate technology, as well as being very expensive to 
operate. But this decision did not finally emerge until 1994, and in the interim reliance 
was placed on Profile and on newspaper CD-ROMs (House of Commons, 1994a). The 
advent in 1990 of newspapers on searchable CD-ROM was thought a mixed blessing. 
The Head of the Public Information Office reported in 1991 that “only a very few 
research assistants can use these sources for themselves, so extra commitment of staff 
time may be called for in acquiring these ‘labour-saving’ devices” (Public Information 
Office, 1991). 

In the Library’s reference services, there was a preponderance of requests for press 
information and articles, all of which was supplied from a cuttings system. This was in 
two parts: articles selected by the staff, indexed on visible indexes, and filed in 
envelopes, and a complete extract of the news and feature sections of The Times, 
supplied by contractors, assigned subject and biographee headings, and filed in 
drawers under them. The upshot of these two complex systems was that there was 
little or no element of self-help by Members, and their researchers had no access to the 
cuttings, which were kept in the Member-only part of the Library. Older press 
enquiries, before the advent of the visible indexing system in 1956, were met almost 
exclusively from the Indexes to The Times, use of which was a laborious and involved 
job. 

This was changing, with the advent of Profile, a charged-for online system, and in 
1992, the manual system was rationalised and cut back, and a greater acceptance of 
Profile made, complete with its charging system (Library Reference Services Section, 
1993, p. 21). However, most Members still had no direct access to this information. 
Every enquiry they put on press material was processed by the Library staff either on 
the Member’s direct request or through intermediaries such as their own employed 
researchers. 

The first time Profile, formerly World Reporter (with Textline, which eventually 
merged with it) was mentioned was in the annual report 1985/6 (Library Reference 
Services Section, 1986, p. 12). In 1988 it was reported Research Assistants “and the 
occasional Member” used the Textline terminal (Library Reference Services Section, 
1988, p. 14). They were allowed access to this as it was a block subscription (i.e. no 
charge per minute or per document retrieved). Textline was a system of referring to 
selected newspaper articles in electronic form via dial-up. 








Textline had been popular with Research Assistants and Members. It was perhaps 
the first hands-on electronic data source they could use for themselves, being 
mentioned in the House (House of Commons, 1988) and being the subject of 
controversy when its charging structure was altered and the Library sub-committee 
chose to withdraw it (House of Commons, 1990a, b). The staff of course had access to 
Profile, but Members did not. The efficacy of Profile was described by Andrew Brown 
in The Independent in December 1991, in words redolent of a pioneer: 


The delights of electronic mail [...] are difficult to explain to people who do not spend their 
entire working day in front of a computer screen. But for those who do, connecting to a 
network means that the information you need pops up on the screen, where you can use and 
manipulate it, rather than on bits of paper, where it can only be read and scribbled on. As an 
example, The Independent’s library has access to the Profile database of British newspapers. 
This contains every article that appeared in most national broadsheet papers from about 
1986. Stories from there can either be printed out on paper, or made accessible on our internal 
network, so the text to which I am referring appears next to what I write on my screen. From 
here I can cut and copy whatever I need and almost attain the fluency of handwriting. 
Nothing will ever make these services cheap; nor are they a wholly accurate record of what 
actually appeared in newspapers: libels are tracelessly expunged from the digital record if 
proved. But they are much cheaper than consulting a conventional newspaper library. 
[...] these are services of interest really to people who are already interested in computers. 
The great difficulty that must be overcome if these networks are ever to become as universal 
as their proponents hope, is that sane people are not interested in computers (Brown, 1991). 


POLIS 

In the other main reference effort, information on the activities of Parliament, which in 
the 1950s had also been indexed by means of visible indexes, had all gone on to the 
Parliamentary Online Information System (POLIS) from the early 1980s[2]. This was a 
system devised by the Library itself in the period 1975-9 to replace its visible indexes, 
which dated from the 1950s. POLIS went live in 1980. Almost no Member had any 
access to POLIS at first, whereas all Members, at least in theory, could use the visible 
indexes. In fact, only a few did so, but some at least regretted the loss of direct access. 
Nigel Spearing, in the debate in the House on the Library indexing system (one of the 
very few times, surely, that a national legislature has turned its attention to library 
retrieval systems) on January 26, 1978, had argued for the retention of a manual index 
for this reason. But he had also foreseen that computerisation might: 


... reproduce these facilities on desks throughout the House [...] not only will the information 
be screened, but at the touch of a few buttons, teleprinters will spew out all one's Questions 
from last session, or whatever (Spearing, 1978, col. 1787). 


But Mr Spearing’s questions about who would have access to the computer system were 
not answered, he contended (Spearing, 1978, col. 1804), and in fact only Library staff 
had any substantial POLIS access for the next ten years. Menhennet and Wainwright 
(1982) reported: 


Members have always received a highly personal service, and are not necessarily expected to 
use the catalogues, indexes, and other secondary services themselves [...] Accordingly, it has 
been assumed that relatively few Members would wish to use the [POLIS] terminals 
themselves, although they can watch staff using them on their behalf (Menhennet and 
Wainwright, 1982, p. 83). 
In 1994, POLIS MkII was to be delivered through the fledgling network 
(Parliamentary Data and Video Network, or PDVN), and it was reported that any 
Member wanting direct access to POLIS would be able to get it only through the 
PDVN: this was a means of popularising the PDVN. Limited dial-up facilities had been 
provided from the late 1980s (House of Commons, 1993a). In the run-up to POLIS MkII, in 
1992-3, there were over 100 registered dial-up accounts for Members and their staff, 
and these were used mostly by staff. 

It was not until a browser-driven version of POLIS was developed in March 1999 
that the staff “monopoly” in this area was fully broken, though from the early 1990s, 
Members’ staff had been trained on the older versions. Henceforth, the elaborate search 
strategy needed for POLIS was replaced by more intuitive, and more trial and error, 
searching by the end-user via the browser. 

A failed experiment was the early presentation of Hansard electronically, by Scicon, 
the providers of POLIS MkI. As The Times reported in 1985: 


Despite much talk in political circles of the importance of information technology there seems 
little fear that MPs and others involved in Westminster life will be criticized for being 
technocrats in their everyday work. An electronic version of Hansard has had to be scrapped, 
having been described as ahead of its time. Only 80 people subscribed to the service during its 
18 months’ operation which, says the providers of the service — software house Scicon, made 
it uneconomic. According to Scicon’s manager, Bill O’Reilly, “People are not yet accustomed 
to electronic delivery of this type of information when it is still available in paper form”. A 
computer-based index of parliamentary information [that is, POLIS] for the House of 
Commons library, also developed by Scicon, has 230 users and will be continuing (The Times, 
1985). 


The Library staff were in this position in 1991: what had not then been contemplated, 
owing to the twin constraints of cost (“nothing will ever make these services cheap”) 
and technology, was their universal provision to users. Indeed, most Library staff 
would have said Members had more important things to do than become computer 
geeks: “sane people are not interested in computers”. At this time, in any case, there 
were too many practical as well as technical constraints to make electronic self-service 
either viable or cost-effective. Access was therefore self-limiting, to a few selected 
locations in the case of access to online information, via dedicated terminals. 

During the Information Committee’s investigation on the provision of a 
parliamentary data and video network (PDVN) (House of Commons, 1994b), Andrew 
Bennett MP, perhaps using the generally received wisdom of the time, asked Roger 
Evans MP (who was giving evidence) “Is it not [...] that trained staff in the Library 
might be a more effective way to access [on-line information]?” Mr Evans agreed, but 
even so, he regretted he could not have free access to the legal database, Lexis, on 
which as a lawyer he had considerable experience (House of Commons, 1994b, Q. 94). In 
a more telling phrase, the Transport and General Workers’ Union House of Commons 
Branch and Secretaries’ and Assistants’ Council, in a memo to the Committee on behalf 
of Members’ own staff, advocated direct access without Library intermediaries 
(“Service via the Library and/or MPs’ own staff”), partly because of the need to impart 
political slants, whereas the Library staff were professionally neutral. “We need to be 
able to extract material from information databases [...] import it onto our own 
machines, analyse it, edit it, select it” (House of Commons, 1994b, p. 66). 








This was a barely attainable aspiration at the time. In reality, 
the PDVN had been rolled out component by component, which led to slow take-up by 
Members and staff; it was pretty unreliable at first; and each service had a discrete and 
non-interchangeable user interface, which made it confusing for all concerned to become 
proficient and to remember. The Library Department’s policy of equipping its own 
staff with PCs had been very influential in spreading the acceptability and inevitability 
of IT. One colleague, returning to work in late 1997 after a five-year career break, 
reported that IT “has left me with the odd impression of inhabiting two parallel but 
different time zones [...] the nature of the work itself has changed little, [but] the 
methods of working have undergone something of a revolution” (Greener, 1998, p. 31). 


The Janet Levin reports 

The Library commissioned reports from a firm of consultants, Janet Levin Associates, 
to research customer attitudes to the services it provided. They were a thorough and 
instructive exercise. The reports (in 1995 and 1999) throw some light on the situation. 
Members rated the Library’s service at the exceptionally high level of 8.9/10 in both 
surveys (Janet Levin Associates, 1995, p. 116), with comments such as “It is a jewel in 
Parliament’s crown. The service is superb. The Library is the best institution of its sort 
in Britain” (Janet Levin Associates, 1995, p. 117). There were however a very few notes 
of caution: “It is very out-of-date; still uses a great deal of resources and out-of-date 
methods. Only very very slowly coming into the computer age” (Janet Levin 
Associates, 1995, p. 117). In the 1995 survey, Levin had asked specifically how staff 
might be encouraged to use Library resources themselves. To this, low users of the 
reading rooms responded that they wanted more training on POLIS (45 per cent) and 
on the internet (40 per cent) (Janet Levin Associates, 1995, p. 120). However, the asking 
of the question was in itself instructive, and arose because of the growing realisation on 
the part of the Library’s management that increasing demand had to be stemmed, that 
technology was the solution, and the remedy was as much as possible to provide 
means for users (especially but not exclusively Members’ researchers at this time) to 
serve themselves, thus avoiding saturation of facilities and the need to seek increases 
in the Library’s complement. It was this thinking that led to the deliberate policy, 
adopted in the strategic plan of 1996, of attempting to transfer searching from Library 
staff to user, a complete reversal of what had obtained five years previously. 


The Strategic Plan and afterwards 
The Department’s Strategic Plan of 1996 (House of Commons, 1996a) included the 
rather neat statement: 


The use of the PDVN for the delivery of information to Members and their staff without the 
intervention of Library staff [emphasis added] at the point of use will be [developed as quickly 
as circumstances allow] (House of Commons, 1996a, p. 3). 


In retrospect, this may be said to be the start of the Department’s avowed attempt to 
transfer information giving effort from the intermediary to the end-user. Whereas in 
the past a great deal of effort had been put into shaping the form and utility of outputs, 
relatively little effort had been devoted to considering delivery methods and 
accessibility issues. 
Another major driver for electronicisation was the physical dispersal of Members 
and their staff across the seven buildings of the Parliamentary estate. The Janet Levin 
Associates Survey of 1995 pointed this out as one of the most serious problems. In 1995 
electronic information availability was still low, so it was not often suggested as a 
remedy. One researcher said “If I had to do that [go to the Library to look something 
up] every single time I had a query, I would just end up walking all day” (Janet Levin 
Associates, 1995, Vol. on Members’ staff, p. 21). The three terminals provided in the 
Derby Gate Library were intensively used — 54 per cent of users found they were not 
able to use a terminal when needed (Janet Levin Associates, 1995, Volume on Members’ 
staff, p. 45). 


The internet as a catalyst to change 

By 1996, Members’ staff, if not Members themselves, were becoming accustomed to 
self-service via electronic means, and the principal of those means, it soon became 
apparent, was to be the internet. The adoption of the net as a finding resource was at 
first gradual, because the PDVN was a pilot scheme, and because some Members 
at first adopted their own dial-up access, through Compuserve, Demon, etc. 

The first mention found of the internet in the official documents of the House is in a 
paper dated June 1993[3]. 

In the mid-1990s, many web sites were of course mere “shop windows” used for 
advertising, without real information content. This began to change rapidly, possibly 
as web technology became more widespread, servers cheaper, and as users signed up. 
As in the country at large, internet penetration was rapid, unforeseen except by a few 
prescient commentators, and all-embracing. In a paper dated December 9, 1994, the 
Information Committee were told that the internet was estimated to comprise 1.7 
million host computers and 30 million users worldwide, and to provide a service would 
come at the cost of £35,000 in the first year and £13,420 per annum subsequently. An 
ISP was selected. This paper was agreed on January 16, 1995[4]. The Information 
Committee were also informed on February 3, 1995: “The Internet presents a special 
case, since it is a mixture of external electronic mail and retrieval services [...] it is at 
present too early to say precisely what the Houses’ usage of Internet may be” (House of 
Commons, 1995). 

Another milestone, perhaps, was that Anne Campbell MP became the first British 
Member of Parliament to have a web site (Schofield, 1995). 

On July 10, 1995, it was reported that the two Houses had registered the domain 
“parliament.uk”, and by the spring of 1996, things had progressed so far that a group 
of officials (including the author) had reported to the Board of Management inter alia 
that the full official text of parliamentary publications be published free on the internet 
(House of Commons, 1996c). This was done later that year. It is not the purpose of this 
article to consider public dissemination of parliamentary material, but it should be 
noted that this radical departure from the received thinking of the 1980s also led to a 
diminution of use of the Library, in that from adoption of the EPG report onwards, 
every internal user who wanted the text of a Bill or Select Committee Report would 
have it on her or his desktop, and no longer had to come to the Library in person to 
consult it. 








Effects of electronic availability on enquiry load 

The success of the Library’s exercise in presenting its outputs electronically, and that 
of the House in similarly publishing its official papers, can be seen reflected in the enquiry 
figures for the Derby Gate Library (see Table I), where all such enquiries from 
non-Members were directed, whether in person, by phone or by e-mail. 

Big falls were therefore recorded after the change of government in 1997, perhaps 
more from the huge turnaround in the population of MPs’ staff than because of 
technical advances. However, electronic provision of basic parliamentary texts was 
important, especially in 2000, as more and more of the Library’s own resources were so 
presented. The level of demand in 2003/4 was less than half that of eight years 
previously. 
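
The "less than half" comparison can be checked directly against the Table I figures. The following is a minimal Python sketch, not part of the original article: the names TABLE_I and share_of are illustrative, the figures are those printed in Table I (with the election-year footnote markers dropped), and the 2004/5 estimate is left out because it is not a recorded figure.

```python
# Table I enquiry figures as printed in the article (footnote markers removed).
# The dictionary name and helper below are illustrative assumptions.
TABLE_I = {
    "1994/5": 36690,
    "1995/6": 38962,
    "1996/7": 37249,
    "1997/8": 30817,
    "1998/9": 31479,
    "1999/0": 31009,
    "2000/1": 25467,
    "2001/2": 22808,
    "2002/3": 23588,
    "2003/4": 19212,
}


def share_of(later, earlier, table=TABLE_I):
    """Return the later year's enquiries as a fraction of the earlier year's."""
    return table[later] / table[earlier]


if __name__ == "__main__":
    # 2003/4 compared with eight years previously (1995/6).
    ratio = share_of("2003/4", "1995/6")
    print(f"2003/4 as a share of 1995/6: {ratio:.1%}")
```

On these figures, 2003/4 comes out at a little over 49 per cent of the 1995/6 level, which is consistent with the statement in the text.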

Usage statistics in the Members’ Library (where the clientele was MPs only) were 
collected on a consistent basis only from 1999/2000, when the system of counting 
telephone enquiries was altered. The figures given in Table II are for personal 
enquiries (i.e. not telephoned or received by letter) and are extrapolated up to 1998/9. 


Table I. Telephone enquiries 

Year      Enquiry figures 
1994/5    36,690 
1995/6    38,962 
1996/7    37,249 
1997/8    30,817* 
1998/9    31,479 
1999/0    31,009 
2000/1    25,467* 
2001/2    22,808* 
2002/3    23,588 
2003/4    19,212 
2004/5    15,500 (est.) 

Note: *Affected by general elections, when the Library is closed 
Source: Library Reference Services Section and Public Information Office annual reports 


Table II. In-person enquiries 

Year      No. of personal enquiries 
1994/5    8,800 
1995/6    7,300 
1996/7    8,100 
1997/8    5,100* 
1998/9    7,329 
1999/0    6,733 
2000/1    5,472* 
2001/2    5,031* 
2002/3    5,752 
2003/4    5,386 
2004/5    4,700 (est.) 

Note: *Affected by general elections, when the Library is closed 
Source: Library Reference Services Section/RRSS annual reports 


Another series which is instructive is that for logged enquiries by the Library’s 
Research service. These are substantial pieces of research work bespoke by Members, 
and by their staff on MPs’ behalf (see Table III). The counterbalance can be seen from 
electronic supply and usage of the Department’s own publications. 

In Table IV, the first column refers to the number of research standard notes 
published electronically, and the second to approximate hits on them. For the most part 
these are published only internally, so the usage reflects library users’ access, not that 
of the public. Tables I-IV are mapped in graph form in Figure 1. 

In 2003/04, total electronic accesses via the intranet were: 

* 24,000 Research Papers; 
* 103,000 Standard Notes; 
* 227,000 Subject Pages; 
* 24,000 What’s New from the Library (weekly bulletins); 
* 14,600 Parliamentary Current Awareness (daily bulletins); 
* 3,000 constituency profiles; 
* 8,600 Bill information pages; and 
* 4,400 debate packs (Source: Director of Research Services, House of Commons 
  Library). 


Table III. Research enquiries 

Year      Logged enquiries 
1994/5    15,531 
1995/6    15,313 
1996/7    15,700 
1997/8    12,893* 
1998/9    15,017 
1999/0    14,947 
2000/1    13,345* 
2001/2    10,851* 
2002/3    11,540 
2003/4    11,261 

Note: *Affected by general elections, when the Library is closed 
Source: Research Division/Service Annual reports (unpublished) 


Table IV. Electronic publication 

          No. of research standard notes 
Year      published electronically          Approximate hits 
1997/8    14 
1998/9    31 
1999/0    224 
2000/1    510 
2001/2    788                               29,000 
2002/3    1,368                             58,000 
2003/4    1,816                             103,000 

Source: House of Commons Commission Annual reports (HC papers) 
Developments of the later 1990s

An innovation of 1997 was the Locata service, developed by the software house, 
Infernet, and the House of Commons Public Information Office. This was a system to 
identify the constituency in which an address fell, which built on technology developed 
originally by various private sector organisations in relation to the 1997 General 
Election — the first internet election. It was eventually delivered over the intranet and 
internet, and relieved the Library as a whole of some 15,000 enquiries a year — both by 
placing the solution on the desktop of the end-user, and also for the non-IT equipped, 
delivering a simpler search mechanism than the printed indexes previously in use in 
the Library itself. The system was designed to flag up areas of difficulty (e.g. where a 
postcode crossed a constituency boundary) for solution by manual means. The savings 
of staff time here were more in the Public Information Office than in the Library itself, 
but the additional electronic resource went some way to establishing the net as a 
resource of first call for Members’ staff. It was also one of the factors that persuaded 
more and more Members’ staff to acquire internet connections and expertise. 

In 1998, an important database of early day motions was also presented on the 
intranet. What previously had been the subject of a fairly complex search in POLIS, 
done only by staff, became an intuitive retrieval from the intranet. Not many years 
previously, this operation had been conducted by pasting cut-up bits of the Notice 
Paper into huge folio guard books, which had been supplanted by an extension to 
POLIS Mk II only in 1989. What in 1988 had been a search in a paper resource that 
existed only in a single copy anywhere had become by 1998 a search that could be 
conducted on any internet-connected PC in the world. 

Again in 1998, the Library’s published papers (Research Papers) were made
available via the internet. In part, this was a response to pressures on staff time in
despatching these documents to an ever-growing mailing list: not, this time, of
Members, who simply picked up their copies in the Library, but of outside collaborators,
researchers, and other parliamentary libraries. The burden of despatching hard copy
had become oppressive, so the solution adopted was to present the documents on the
internet and allow people to use them at will. In fact, the decision was a significant
milestone in making information about current parliamentary activity better known on
a worldwide basis, such that the texts received 420,000 hits in the first year of reliable
statistics, 2001/2 (730,000 in the latest available year, 2003/4)[5].

Figure 1. Effects of electronic availability on enquiry loads

Electronicisation of the delivery of press material was still being considered from a 
domestic point of view. It was thought that Lexis-Nexis was too broad-brush for the 
Library, and especially its researchers, who needed to be able to isolate articles with 
significant content to the exclusion of less important material. It was also thought to be 
much too expensive. The result of this need was the Press Comment Database, a 
home-grown system, with material being selected by the practitioners in the Reference 
and Reader Services and International Affairs and Defence Sections. It is perhaps 
arguable that before an expensive home-grown system was developed, different ways of 
using Lexis-Nexis (or FT Profile) should have been considered. However, the Press
Comment Database was inaugurated in 1998, with access via the intranet. In RRSS, it had 
a team of an editor plus three spending some 50 per cent of their time inputting material. 


Fine tuning and later developments 

The late 1990s presented a favourable climate for the consolidation of the advances 
made earlier. The PDVN stabilised, and provided reliable and fast access to the 
internet. Though there were complaints about its reliability, remote secure access was 
also provided via Citrix dial-up. Standardisation on Microsoft products led to common 
user interfaces and more intuitive searching for all users without the need for special 
training in any one, or the need for regular use to achieve good results. Older products, 
like POLIS, were given browser interfaces, and the massive growth in home PCs
familiarised many more users, including Members, with the basics of IT. Henceforth,
they could say clearly what they expected, rather than being offered something by the 
Library that the information professionals thought useful. At the same time, web sites 
began to be less advertising shop windows, and became more content-rich, a trend 
encouraged by the Government in its own web sites and by e-champions in 
government departments. 

The seminal event in changing Members’ use of the reference services of the
Library for press material was undoubtedly the advent of Lexis-Nexis on the intranet,
which occurred in February 2002. There had been an experiment, involving 21 
Members, allowing direct access to Profile, in 1993, but this was not made permanent, 
as costs were reckoned to be likely to reach £66,000 per annum for unrestricted access 
for all MPs (House of Commons, 1993b). Insufficient Members were interested in using 
Profile for themselves and they did not use it enough to become familiar with the quite 
complex search syntax. In 2001/2, the Library and the proprietors of the database 
(which had just been renamed Lexis-Nexis from FT-Profile) negotiated a block 
subscription — that is, a single payment for unlimited use by those connected to the 
Parliamentary Network. What previously had been a scarce tool, the use of which was 
protected jealously and the cost of which was largely determined by the skill and 
economy of the library staff, now became a common resource on all parliamentary PCs, 
in which the end-user could enjoy free warren. It was the philosophy of the internet
brought to a previously strictly controlled web service. It also spelled the end of the
Press Comment database, at least for non-international enquiries. By the middle of
2002, although usage figures were running at about 9,000 per month, its usage for 
in-house reference purposes had so declined that proposals were made to abandon its 
coverage of home (press) articles. This occurred in March 2004, though it continued as 
a research aid for international enquiries. In fact, the elimination of unwanted material 
from Lexis-Nexis has been much less of a problem than had been anticipated, and the
design of the end users’ search screen for that system allowed searches to be restricted 
in a way that made the job easier. 

Another step in this direction happened in November 2002, when the House of Lords 
Library took out a subscription to the Times Digital Archive (TDA), in which Gale 
Research had digitised The Times newspaper 1785-1985. Only 1921 to 1968 was available 
at the start, but the whole 201 years came on tap by early 2004. The TDA revolutionised 
access to historical press material, and again, delivered it at the point of end-use, thus 
obviating the need for most MPs to contact the Library for such material as obituaries, 
constituency political information, biographical material, or the “when did Lloyd George 
do ...” type of enquiry. Access to the TDA was limited to a certain number of concurrent
searches, but in fact that had seldom been a problem. The TDA has also provided a
back-door approach to searching for historical parliamentary material, since the gallery
reporting of the newspaper was so good, especially in the late nineteenth century. 

The Parliament that took office after the 2001 election was the first in which a 
familiarity with IT in a majority of its members could be assumed. The House also took 
the initiative in supplying a standard provision of well-specified IT equipment to 
Members. In the expectation of much greater self-service, the Department’s pages on 
the intranet were recast in 2002 so as to provide a subject-based approach. A user is 
offered the choice of subject headings, linked by taxonomy, and then has the option of 
viewing online the Library’s published output on the subject, and/or of contacting the 
subject specialist who deals with the matter. 

Another research tool made available electronically on the intranet rather than on a 
bespoke basis was that of constituency profiles. This enables the selection of a 
constituency name and the immediate provision of a selected raft of information 
imported from other databases to standard format, including population, election 
result, wealth and poverty indicators, unemployment, etc. A great deal of effort was 
applied to providing commonly asked for lists, such as in POLIS, real-time lists of
Green and White Papers, command papers, etc., the ranked list of numbers of 
Questions asked, and voting participation rates, via the subject pages of the intranet. 
All of these reduced the calls on staff time and the need for staff intervention, and the 
subject pages are so structured that connected material, for instance in the Press 
through Lexis-Nexis, or in linked web sites, is easily available. 

In 2000, the new parliamentary building, Portcullis House, was opened. The 
Department here opened what was called the e-library, a training area for IT, plus a 
bank of eight PCs in cybercafe style, which could be signed on with individual network 
passwords, or generically, without the need for a password at all. These became very 
popular, not so much with Members or their staff, but with staff of the House, 
especially during breaks. However, they did popularise electronic searching — mostly, 
of course, for recreational purposes and e-mail — and contributed much to the 
electronic slant of the Department’s work. In 2004, this resource was converted into an 
enquiry point, and (for security reasons) generic sign-on was abolished. 
Traditional means of receiving enquiries were maintained. Not all Members were
users of technology, but as time went by, more and more became so, and training was 
offered to them as well as to their staff. The Members’ Library suite, because of its 
proximity (about 50 yards) to the Chamber, is a popular place for Members to work. 
The author reported, as Head of Reference and Reader Services, in April 2001:


The decline in enquiry load [of approximately 5 per cent] noted is not surprising. John 
Ackroyd, writing about the future of academic libraries (ASLIB Proceedings, 53:3, March
2001, pp. 79-84) says “usage is becoming more distributed and more screen-based, as a
consequence of electronic delivery of material [...] Libraries have ceased to be the first resort
for users and are becoming a complementary source to off-campus use, distributed and net
based services ....” Though the HCL is not an academic library, it shares some of their
characteristics. In the case of RRSS, the availability of material put in electronic form by the 
Library or House is one side of the equation, but another is the growing tendency, amongst 
both Members and their staff, to work from their offices, especially given the enhanced 
facilities in Portcullis House. This tendency will only increase, as older, less technically 
competent Members and assistants retire or move on, and as accommodation is upgraded 
(House of Commons, 2001, p. 4). 


Conclusion 

Information centres such as the House of Commons Library did not face much change 
in old certainties before the mid-1990s. In the past, the enquiries came in, the numbers 
dependent mostly on factors of current interest. Whatever new topic arose, information 
practitioners would confront the challenge, discover, and perhaps even research and 
publish, new sources to answer the demands put upon them. 

The advent of electronic sources has empowered the end-user in just the same way 
the invention of printing did 500 years ago. The sages and information-rich priests and
clerks, lawyers and scribes of those times saw their role change, as books became more 
widespread and cheaper. It is not, perhaps, surprising that the Reformation and the 
printing revolution were more or less coeval. And books were just as unstructured — 
without pagination, chapter headings, contents lists, etc., in the fifteenth century, as the 
internet often is now. Another analogy one might apply is the transformation of 
retailing in the 1950s from counter-based shops where an assistant ran around to select 
the customer’s desired goods to the supermarket, where the customer made his/her 
own selection in his/her own time.


The New Reformation has come to the House of Commons Library. The Library,
with its 15-foot high shelves, has never been physically suitable for browsing, and for
the most part, Members and other users do not “graze” the book stock. But as electronic 
sources become more available, and Members and their staff become either younger or 
more technically competent, or both, grazing in electronic sources has become and will 
in the future become even more common. 

The Commons Library’s policy had always realised that the fewer the number of
intermediaries between user and information the better. That is why for preference a 
Member was always referred direct to a member of the research service staff rather 
than the enquiry being taken down in writing and passed on. Now we have a situation 
fast developing where the traditional need for the old skills of the intermediaries is 
being modified or cut out by the Member’s modem. The latest estimates imply that
some 40 million people in the UK will have access to the internet by the end of 2005.
This will mean that skills of reference practitioners will have to change. 

But equally, there will be material that does not lend itself to online exploitation. 
Library staff knowledge of historical and archival sources will be more called upon, 
and that of parliamentary practice and procedure. They will be called upon more to
evaluate and amplify information the Member or user has him/herself found, rather 
than deriving it in the first place. We may need to redeploy staff so as to be information 
facilitators and interpreters rather than information seekers. Already Library staff act 
as on the spot IT advisors: this is a role we may have to develop. The author was 
reminded of the dichotomy one day when, having been approached by a Member with 
difficulty in connecting a laptop, he next wanted to use a dip pen. 

The process described above is, of course, incomplete and ongoing. The next large 
project is the Parliamentary Information Management System, PIMS, the replacement 
for POLIS, which will be a huge data and information/knowledge management 
resource for Parliament; this will supplant POLIS in April 2005, and complete the 
transformation from the reference-based POLIS in which an awkward transposition 
was necessary to find the full text, to a fully searchable knowledge repository. Other 
projects ongoing include the digitisation of Hansard for 1803-1988 and the extension of the
Electronic Parliamentary Community to the Library, for the electronic exchange of 
information between government and parliament. And although the Library has 
access to commercial databases of statute law, the long-promised Statute Law 
Database could fill a significant void. 

The first decade of the twenty-first century is an exciting time for information folk 
to live in. So long as we are flexible and adaptable, alert to changing needs, and not 
hidebound in our approach, we can serve users in the new scheme of things, selecting,
distilling, interpreting and evaluating information, as well as we did in the old. 


Notes 


1. Library services to non-Members were the responsibility of the Public Information Office 
until 1995 and those relating to identification of constituency (5000) were dealt with in the 
Public Information Office itself. 


2. For a comprehensive account of the early development of POLIS, see Menhennet and
Wainwright (1982).

3. INF/137 (unpublished June 1993 but later published in HC 237 1993/94), where the Library
listed e-mail/Internet as desirable services to be provided by the PDVN.

4. Pilot Scheme for access to the Internet, INF/284, unpublished.

5. House of Commons Commission Annual Reports.

Unpublished material is in the Library’s own collection, and would be available to students
at the Parliamentary Archives (HLRO). “INF/” indicates papers presented to the Information
Committee.
Abstract

Purpose — To describe the initiatives of the Scottish Parliament in the field of e-democracy and 
assess the prospects for future developments. 

Design/methodology/approach — Analysis and review. 

Findings — The Scottish Parliament has always seen the internet as one of the major mechanisms for 
engaging Scottish citizens in the Parliament’s business and activities. Its most successful initiatives 
have been the e-petitioning system, the webcasting of proceedings, the discussion forums and the MSP 
video diaries. 

Research limitations/implications — Relevant to parliaments and other representative 
institutions. 

Practical implications — Simple implementable tools are described that have been shown to be 
effective. 

Originality/value — Few parliaments have been able to put theory into practice in a short time. 
Applicable to other small parliaments with limited resources wishing to enhance democratic 
participation by electronic means. 

Keywords Democracy, Parliament, Scotland, Citizen participation, Internet 


Paper type Case study 


The fact that the Scottish Parliament was established well into the age of the internet 
has given it a huge advantage in terms of incorporating new information and 
communication technologies into its practices and procedures. Although the Parliament 
was established as a result of the Scotland Act, passed by the UK Parliament in 1998, it 
was able to be created in an astonishingly short time because of the wealth of discussion 
and cross-party agreement that had taken place over many of the preceding years. 
Much of this prior planning was encapsulated in the work of the Consultative Steering 
Group (CSG), which was set up by the Secretary of State for Scotland in November 1997 
under the chairmanship of Henry McLeish MP. Their report, Shaping Scotland’s 
Parliament, was published in December 1998 and made detailed recommendations for 
how the Parliament should actually work (Scottish Office, 1998a). 

The CSG identified four key principles on which the Parliament’s operations should 
be based. These were endorsed by the Parliament in 1999, and are: 


* sharing the power; 

* accountability; 

* access and participation; and 
* equal opportunities. 


Scottish Parliament and e-democracy

Received 17 February 2005
Accepted 28 February 2005

Aslib Proceedings: New Information Perspectives
Vol. 57 No. 4, 2005, pp. 333-337
© Emerald Group Publishing Limited
DOI 10.1108/00012530510612068


The CSG was supported by an Expert Panel on information and communication
technologies, whose remit was:


... to provide advice on how the Scottish Parliament might use technology to:

* promote internal efficiency and innovative ways of working;
* provide information about its proceedings and its work to the widest possible audience
  in the most accessible way;
* make it as easy as possible for the Parliament and individual MSPs to exchange
  information with external organisations and the public; and
* encourage democratic participation and involvement (Scottish Office, 1998b, pp. 87-8).


The Expert Panel’s report (Scottish Office, 1998a) made detailed recommendations that 
would promote both parliamentary efficiency and openness, accountability and 
democratic participation, many of which were implemented in the Parliament’s early 
years. 

The first elections to the new Parliament were held in May 1999 and by polling day 
a fully functioning web site had been launched (see www.scottish.parliament.uk) with 
a commitment to make all of the Parliament’s proceedings available to the public 
electronically as well as in printed form. Members of the Scottish Parliament (MSPs) 
were supplied with standard IT equipment for use both at the Parliament 
Headquarters in Edinburgh and in their local offices. The creation of a standard IT 
network from the outset, to which everyone was connected with standard software, 
meant that the transfer of information was much easier than having to deal with a 
variety of different hardware and software installations. This enabled the Parliament 
to allocate to each MSP an individual e-mail address with a standard format (e.g. 
joe.bloggs.msp@scottish.parliament.uk). There was some discussion at the planning 
stage about the choice of domain name. An early assertion of the Parliament's 
independent view of itself was evident in its rejection of a “gov” domain in favour of 
the albeit subordinate looking “parliament” domain. 

One of the disappointing aspects of early attempts to establish e-democracy
initiatives was the relative poverty of the telecommunications infrastructure in 
Scotland generally. Five years later this picture is much improved, but Scotland still 
shares with some other places difficulties associated with poor connectivity and 
downloading large quantities of data. 

From the outset, the Parliament established a network of Partner Libraries. This is a
network of 80 public libraries spread geographically throughout Scotland. It began 
with one in each constituency but was expanded to include more libraries in remote 
areas. The successful rollout of the People’s Network in Scotland has now given all
563 public libraries free internet access. The Partner Libraries are offered
printed copies of the Parliament’s publications and staff are trained and supported in 
answering enquiries from the public about the Parliament and its publications. These 
libraries offer opportunities as venues for activities associated with the Parliament, and 
indeed some have been chosen as the location of MSPs’ constituency surgeries. The 
Parliament’s Outreach Service recently held a successful video-conferencing event with 
local community representatives in a library in Broughty Ferry in Tayside, who spoke 
directly to their constituency MSP at the Parliament in Edinburgh. This was 
enthusiastically received by the participants and is likely to be repeated in other
Partner Libraries as facilities and opportunities arise. 








Soon after the 1999 election it was possible to put biographical information about all 
newly elected Members on the web site together with their e-mail addresses and contact
details. Naturally, as the first Parliament in Scotland for 300 years, this attracted a 
great deal of attention, and has set the benchmark for the improvement of the quality 
and interactivity of this type of information in the future. When the Parliament 
revamped its web site to coincide with its migration to the new Holyrood building, 
short biographical films of all MSPs were included. These went live in September 2004, 
making the Scottish Parliament the first democratic institution in the world to offer 
such a service to its citizens. Notable successes of this particular venture were
the ability to attract the support of the Members for participation in the project, and
the dovetailing of biographical information, which is provided as text both online
and in print, with a much more engaging and entertaining video snapshot of Members 
themselves. Many Members have developed their own web sites, with varying degrees 
of interactivity. The Parliament’s web pages include links to MSPs’ personal web sites, 
but their contribution to e-democracy is not considered here. 

Webcasting was another service the Parliament was able to introduce at an early 
stage. The Scottish Parliament was the first Parliament in the world, it is believed, to 
offer such comprehensive webcast access to its proceedings. The opening ceremony in 
July 1999 was broadcast to the world, as were the Parliament's sittings in Glasgow in 
May 2000. From September 2000 this service became a permanent feature, and has 
steadily developed since then. Since the Parliament’s recent move to the Holyrood 
building, all public parliamentary business, including all Committees, has been
broadcast live. All live sessions are archived for a minimum of one month and a 
permanent archive of major parliamentary events has also been maintained. 

Making pictures and documents available electronically and facilitating members of 
the public contacting both the Parliament and individual MSPs by e-mail is of course 
important. However, it can only provide the basic foundation for real engagement. 
More challenging is the objective of genuine interactivity between elected 
representatives and their constituents. The purpose of this activity is to enhance the 
democratic process. It is fruitless to embrace new technology for its own sake; it can 
only engage the citizen in the democratic process if it is integrated into a genuine 
dialogue between electors and the elected. The Scottish Parliament has tried to achieve 
this objective in two different ways. 

First, the Parliament pioneered the use of interactive forums to support a discussion 
relating to an item of forthcoming Members’ business. These are short debates initiated 
by individual MSPs and held at the end of the parliamentary day. Since September 
2002 a total of 26 such forums have been hosted, some of which have been remarkably 
successful. Participation levels in these online forums have varied greatly from as few 
as ten posts to over 400 in the case of a forum on wind farms. In part this reflects the 
differing levels of interest in the subject under discussion and is partly due to varying 
levels of promotion. One of the most successful Members’ business forums concerned 
the subject of chronic pain. Many sufferers were able to present their own experiences 
in time for the Member who initiated the debate to incorporate this feedback into her 
speech (Scottish Parliament Official Report, 2002). Many participants were thus able to
see how their contribution could form part of the parliamentary process. 

Interactive forums or bulletin boards have also been run in conjunction with some 
committee inquiries. For example, the Education Committee conducted one in March 
2002 as part of its inquiry into the purposes of education, and the Enterprise Committee
held one in June 2002 during their tourism inquiry. One recent success in this area was 
an online questionnaire used during the inquiry by the Finance Committee into the 
relocation of public sector jobs, including the proposed relocation of the Scottish 
Natural Heritage agency, which attracted around 2,250 signatures. This success was 
attributed in part to the fact that they guaranteed anonymity and also because their 
objective was to reach directly the affected individuals. The evidence gathered by this 
online questionnaire was backed up by the other evidence given to the Committee by 
organised interests, but it was given more weight because of its individual nature, and 
the Committee felt that it had been a worthwhile exercise. 

The second, and most successful, demonstration of the Scottish Parliament’s
e-democracy credentials is the e-petitioning system. The
Parliament’s commitment to access and participation led directly to the establishment 
of a system for dealing with petitions from individual citizens (a single signature is 
enough) and of ensuring that the issues are examined within the Parliament in a 
meaningful way. From the outset it was possible to petition the Parliament by sending 
an e-mail through the contact given on the web site. The primary requirement for 
admissibility is that the petition must request the Parliament to do something that it 
has the power to do. The Public Petitions Committee accepted its first e-petition on 
March 14, 2000, the first statutory body to formally accept e-petitions (McMahon, 
2004). This system was developed in partnership with the International 
Teledemocracy Centre at Napier University and was further developed and formally 
launched in February 2004. 

The e-petitioner system allows a petitioner to gather signatures and to develop a
discussion about the topic before the petition is formally lodged with the Parliament. 
Each e-petition has its own discussion forum where visitors and signatories can 
discuss the issues online. Supporting information can also be added so that the issue 
can be viewed in context. Once the agreed period for hosting the petition online has 
expired, the petition is formally submitted to the Public Petitions Committee. The 
International Teledemocracy Centre presents a report to the Committee reviewing the 
extent of the online support and a summary of the online debate. In addition, 
petitioners can track the progress of their petition through its life in the Parliament. 
Once the Public Petitions Committee has considered a petition it may investigate the 
issues itself or refer the petition to a more appropriate subject committee for 
consideration.

E-petitions have attracted signatures from many other countries and only the 
signatory’s name and country appear on the site so that these can be identified. There 
is no requirement for a Member to sponsor a petition, so they are genuinely expressing 
the concerns of individuals or groups. They also overcome barriers of time and 
distance, and have encouraged participation in real politics by people who might 
otherwise have felt that there was no opportunity to participate. 

The Public Petitions Committee has embarked on a programme of events to 
promote the petitions system, particularly in relation to groups traditionally 
marginalised from the political process. They plan to hold one event in each of the 
eight Scottish Parliamentary regions, and each event includes a presentation on 
e-petitioning. Members of the Public Petitions Committee and other MSPs support the 
e-petitioner system with great enthusiasm because they can see that it succeeds in
addressing constituents’ concerns as part of a genuine democratic process. Not every
petitioner is satisfied by the outcome of their petition, but people appreciate that it has 
been given serious consideration. 

The Scottish Parliament has an explicit commitment to encourage the engagement 
of the Scottish people in the Parliament’s business. As a new Parliament established on 
the eve of the millennium it prides itself on its ability to innovate, but to do so 
cautiously and only when resources and opportunities permit. In reviewing the 
effectiveness of the measures implemented to date, staff will take into account the 
appropriateness of different tools in different circumstances. Where using electronic 
technology can enhance the democratic involvement of the electors with the elected, the 
Parliament will continue to try new developments. Where Members of the Parliament 
can feel that these developments improve the quality of democratic representation they 
will support them. The role of staff is to identify and choose the most appropriate 
methods that will deliver the benefits that Members expect and deserve. 
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Abstract 
Purpose — To examine the weblogs written by members of UK legislatures and to determine whether 
such weblogs address commonly cited criticisms of MPs’ web sites and serve to bridge the gap 
between representative and constituent. 
Design/methodology/approach — Examination of the literature on MPs’ web sites to draw up a list 
of common criticisms. Construction of evaluation criteria to analyse the blogs in terms of content, 
currency, design, interactivity and evidence of personality both as a snapshot and over a longer period. 
Findings — That weblogs are, on the whole, kept up to date and show promising levels of activity.
Blogs enable constituents to see with what their MPs have been involved (on both the local and the 
Parliamentary stages) and to see what areas of policy particularly interest their MP. Personality of the 
MPs is apparent on most of the blogs, which are less party-oriented than many MPs’ web sites. 
Although the gap between representatives and constituents may have been bridged to an extent,
blogging is still largely a top-down form of communication — even though people do submit relevant 
and pertinent comments to the blogs, proper two-way debate is rarely seen and comments are not 
always acknowledged or answered. 
Research limitations/implications — Based on a small number of blogs covering the UK only. 
Practical implications — Provides simple evaluation criteria that could be applied to blogs in other 
areas.
Originality/value — Provides a useful first structured analysis of weblogs written by elected 
representatives, on which further work can be undertaken once the sample size has increased and 
existing blogs are more established. 
Keywords Internet, Worldwide web, Politics 


Paper type General review 


Introduction 

Use of the internet by political players is now a well documented and researched topic. 
Studies have covered use of the web by MPs, political parties, trade unions, 
government departments and online activists. However, the net is not a static 
phenomenon and as new internet ideas come to the fore, there is renewed impetus to 
re-examine the subject. Weblogs are a relatively recent development in the online 
political arena but are certainly worthy of further research. 
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the analysis, though this inevitably means there is no party balance and the sample 
size is too small to draw definitive conclusions. In terms of scope, all of the sample are 
Westminster MPs with the exception of Peter Black, Assembly Member for South 
Wales West. 


Definitions and history 
There are some technical definitions available on the net: 


Weblogs provide a series of annotated links to items such as news stories, and often include 
personal rants. They are maintained by one person, most commonly someone who is involved 
in Web design or some other tech-related field. Weblogs are usually updated on a daily basis 
and always reflect the style and attitude of the author (Battey, 1999). 


This might have been true in the embryonic days of blogging, but now blogs are less 
the preserve of technical staff and are more mainstream. A better and more open
definition is offered by The Guardian: 


A weblog is, literally, a “log” of the web — a diary-style site, in which the author (a weblogger, 
or “blogger”) links to other web pages he or she finds interesting (Perrone, 2004). 


In terms of definitions in the literature, Blood is on the ball: 

... a weblog is a coffeehouse conversation in text, with references as required (Blood, 2002, p. 1).

As Blood states and most commentators agree:

... weblogs are hard to describe but easy to recognise (Blood, 2002, p. 1). 


Jorn Barger takes the credit for coining the term “weblog” back in 1997. His
robotwisdom web site was his passion for two solid years and covered his daily 
thoughts on a wide variety of issues. However, in reality blogging has been around in 
some shape or form for much longer. Tim Berners-Lee, the inventor of the World Wide Web, listed
new web sites on a dynamic web page at CERN (http://info.cern.ch/) in the early days of 
the net. In fact, in this formative period, most homepages consisted of favourite links 
with a little additional information. Arguably, a modern day blog consists of a link to 
an item of news with some personal commentary or observations. 

So, Barger invented the term weblog but blogs have come a long way in only a few 
years. How did they make the transition from friend of the techie to tool of the elected 
representative? As with most internet fads, most of the early developments occurred in 
the US and have been mirrored across the world. Back in 2000 The New York Times 
was speculating that blogs would be “the next big thing” as everyone has something of 
the amateur publisher and newshound about them: 


The concept is simple enough. Create a Web page. Update it regularly with brief personal 
reflections or witty commentary, sprinkled with links to other pages. Put new entries at the 
top of the page, pushing older ones down. Voila, you've got yourself a weblog. 

That may not sound like the recipe for a social movement. But in the past two years, 
thousands of people have started their own weblogs, creating a vast sprawl of sites that, to 
the uninitiated, might feel like a parallel Web universe (Gallagher, 2000).


A cursory glance at Web sites that list or catalogue blogs shows that blogging has 
invaded every possible area of web-life, and however obscure the subject, someone 
maintains a blog on it. A few examples, from the sublime to the ridiculous: 
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* war (www.blogsofwar.com); 

* gambling (www.gamblog.co.uk); 

* corporate law gossip (www.corplawblog.com); 

* Norwegians living in the Netherlands (www.vinje.demon.nl/); and 


* the unofficial blog for Barney, President Bush’s dog (presumably this is a proxy 
blog! — www.blogcap.com/barney/). 


Blogs are so commonplace now that an episode of the hit TV series The West Wing 
centred on a story that breaks first on a newsblog before being picked up by the 
mainstream press (aired November 19 in the US and reviewed at
http://westwing.bewarne.com/sixth/605hubbert.html).

In terms of political blogs, Tom Watson MP was the first to jump on the impending 
bandwagon. His blog began on March 6, 2003 as part of a restructured web site. His 
introductory blog stated: 


This is the first entry in the new home of Tom Watson. 

There’s not much to see here right now, as all previous activity has been at 
tomatwestbrom.com 

We're going to be busy for a while ... designing the new site (I have a strange feeling that 
the display width will be one of the higher priorities) and importing the archives from the old 
site, but if you want you can sit back and watch the whole thing come together. 

Pretty cool, huh? 


It received eight comments (mainly from friends testing the site) and was reviewed in 
The Guardian: 


West Bromwich MP Tom Watson has become the first UK politician to publish his own 
weblog. 

The Labour backbencher’s new site went live this week, making him the first MP to 
publish an online diary of his working life. 

Although many MPs have standard websites, there are none who update theirs on a daily 
basis, recording their thoughts and actions in a diary style (Tempest, 2003). 


When interviewed by The Guardian, Mr Watson MP stated his motivations:


It’s a political risk, baring one’s soul in public every day, but I want to share the dilemmas of
representing people. 

You've got to be frank, otherwise the concept doesn’t work, so no doubt some young 
graduate in Conservative central office will be scouring the site for quotes to trip me up on, 
but I don’t care. 

The blog allows me the chance to speak freely and rebut any claims that are made. That’s 
why there’s no point having a researcher or somebody write it, it has to come from the heart, 
and it has to be daily (Tempest, 2003). 


So for Tom Watson, regular entries and personality are the keys. 

Since then, a small number of MPs have followed suit. During the interview 
with The Guardian, Watson claimed “At the moment I believe I’m the first 
Westminster MP to have a blog, but I believe it will be the main method of 
communication for politicians within ten years”. He may well be correct, but what 
about right now? 











Literature review 
The purpose of this literature review is two-fold: 


(1) to introduce some general works on weblogs and blogging; and 
(2) to examine the literature on MPs’ efforts on the net thus far.


Sadly, very little has been written specifically about blogging politicians. There are, 
however, a few general works on blogs that provide useful background reading on the 
topic, often with definitions. One such is The Weblog Handbook: Practical Advice on 
Creating and Maintaining Your Blog by Rebecca Blood (2002). Written as a guide to 
creating a blog, the first two chapters contain a useful history of blogs and reasons to 
blog. Blood covers the early days of the net, when homepages consisted of lists of
links/bookmark files, and notes how the net has gone full circle: weblogs are
largely a reincarnation of these lists, a move away from the corporate and commercial
middle ages of the web. To her, Mosaic’s “What’s new?” page was ostensibly the first
blog. Thus, she also disputes that Barger had the first blog, though she does agree that 
he deserves credit for coining the term. Blood also describes how blogs moved away 
from the domain of the techie and became more universal: the development of simple 
blogging software like Blogger ensured that this could happen. 

There is an interesting section on newsblogs. Usually commentators have 
suggested that newsblogs are a new and different form of journalism that bring out the 
amateur news hack in everyone, but Blood disagrees. She argues that newsblogs 
supplement and work alongside traditional news media by filtering and reporting 
news. She praises the growth of personal diaries and eyewitness accounts (such as 
those that sprang up after 9/11) but overall sees blogs as being a new way of
distributing and collecting the news rather than a new source of news itself. 

Overall, Blood’s book is most appreciated for the definitions it contains: it is great 
for clarifying why blogs are different to online journals and more than just online 
diaries. It also clearly establishes how blogs are different to “normal” web pages and 
what we should look for in a blog. For Blood, “the appeal of each weblog is grounded 
thoroughly in the personality of its writer: his interests, his opinions, and his personal 
mix of links and commentary” (Blood, 2002, p. 6), and “the weblog, updated regularly, 
is designed to be visited again and again” (Blood, 2002, p. 9). 

The Political Mapping of Cyberspace (Crampton, 2003) is a very philosophical look at 
cyberspace in general with many references to Foucault and Heidegger, but Crampton does
make a few pertinent suggestions about blogging, particularly to do with bloggers 
forming communities and blogging as a form of resistance: “some blogs are 
deliberately oppositional, designed to promote political discourse which bypasses the 
traditional media outlets in a more democratic manner” (Crampton, 2003, p. 106). 
Crampton makes no overt reference to politicians in this context, choosing instead to 
concentrate on newsblogs. A missed opportunity — if only political blogs had received
as much coverage in the literature as newsblogs.

Although little has been written about political blogs, much has been written about 
MPs on the internet. A quick run through some of these articles shows a striking
amount of common ground in their criticisms, particularly the
lack of true interaction and the sterile/dry nature of pages that rely heavily on
party-produced copy.
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The Institute of Economic Affairs undertook an analysis of MPs’ web sites in 2000 
and branded them “inept”, “flaccid” and “bland”. It continued: 


The researchers say they uncovered an array of poorly conceived pages that conveyed little of
an MP’s personality and failed to inform constituents about what their MP thought and did. 

The Institute said MPs are wasting a real opportunity to engage with constituents, show 
them where they stand on important local and national issues and bring government closer to 
them (Ward, 2000). 


Likewise, a Hansard Society report on MPs on the web found similar problems: 


If politicians want to engage those who are switched off by conventional politics, they will 
need to be imaginative in their use of the new medium at their disposal (Coleman, 2001, p. 4). 


The Hansard Society commissioned MORI to poll the public about MPs’ web sites. 
Almost 2000 adults were questioned at 194 sampling points. The sample included 
people who were already au fait with internet technology (i.e. with home access) and 
those who were not. 

When offered a choice of online innovations and asked to pick which would be the 
most useful to them, the results were: 


An online surgery so that constituents can raise problems with him/her via the Internet (39 
per cent) 

A consultation forum where he/she can read constituents’ views (22 per cent) 

A Web site containing their daily diary (10 per cent) 

e-mail updates sent to constituents by him/her on matters of importance (15 per cent)

14 per cent of the sample thought MPs having an interactive Web site was the thing they
most wanted to see over the next five years (Coleman, 2001, p. 5).


In conclusion: 


It is clear from the preferences expressed that the public is less interested in being recipients 
of politician-generated information and more enthusiastic for interactive opportunities 
(Coleman, 2001, p. 5). 


The report also makes a key point about interaction: “interactivity is not in 
essence a technological issue, it is more about the culture of democracy, and 
specifically the relationship between representatives and the represented” (Coleman, 
2001, p. 7). This is certainly true for weblogs, which require little technological
know-how.

Gary Halstead’s article “MPs: cyber-men or cyber-shy” (Halstead, 2002) reinforced 
the Hansard Society’s findings. Halstead tried to measure the interactivity of MPs’ web 
sites using similar criteria to the Institute of Economic Affairs in their analysis. 
Halstead also questioned a sample of MPs about the motivations behind their web sites 
and asked them to rate how important their different functions were, e.g. as a
campaigning tool, to keep in touch with constituents, to foster interaction, etc. He 
concluded: 


MPs must effectively embrace technology to re-connect with their constituents or risk 
becoming ever more remote (Halstead, 2002, p. 385). 


Yet another survey of MPs’ web sites (Westminster Watch, n.d.) provided more 
evidence of poor quality web sites and lack of interactive capabilities: 








The survey has found that only 82 out of 658 MPs have their own website, and of those, most 
are of a pretty amateurish standard. 

Sadly, the majority of sites are slow to load, fail to encourage interactive communication 
with constituents. 

Many MPs seem to see their websites as on-line election leaflets, rather than as an 
opportunity to use technology to have a two-way communication with their constituents 
(Westminster Watch, n.d.).


Interestingly, criticism is not just restricted to UK MPs, as this Canadian article
indicates:


The survey also indicates that just 27 percent of MPs’ websites use interactive tools such as
online feedback forms or surveys which would allow citizens to express views directly to
their local MP (Centre for Collaborative Government, 2002).


A recent article in the New Statesman suggested the true reason for the tokenistic
state of MPs’ web sites:


Thompson (of Pipex) is particularly sanguine about what has been achieved. “For an MP 
having a website is now just a ‘hygiene factor’,” he says. “Not having one is embarrassing. But
it makes little impact on citizens in the constituency, and there is no evidence at all that it 
helps them win votes in elections” — you're criticised for not having a website but it serves a 
limited purpose (Crabtree, 2004, p. xi). 


Continuing the good work, the Hansard Society filled a crucial gap in the literature 
with its “first impressions” analysis published in July 2004 (Ferguson, 2004). This is 
essential reading for anyone interested in blogs in the political sphere, and emanates
from a joint project between the Hansard Society and the All Party Parliamentary Group
on E-democracy (APPG). What started with a debate in Portcullis House about political
blogs culminated in this fine study. Adopting a qualitative approach, the study
evaluated political blogs in various guises: not just those of elected officials but also
those of a prospective parliamentary candidate (PPC), a political journalist, a think
tank and a campaign group, plus an overseas blog. The sample size was eight and the blogs
were monitored over two week-long periods. The evaluation criteria covered the level 
of activity on the blogs, both from the bloggers and their readers. Blogs were divided 
into four categories: fact, opinion, experience and questioning, and the results were 
used to define whether the blogs were more “soapbox” than “forum”. The Hansard 
Society also assembled a jury of eight to monitor the blogs over a four-week period. 
These eight had generally low levels of political participation but they were required to 
review each blog every week for a month. Asking the public (i.e. the target audience of 
blogs) what they think is an excellent way of judging their success. 

The conclusions drawn make interesting reading. Although these blogs were not 
necessarily well received and were criticised for lacking interaction and occasionally 
being “boring”, the jurors could see the potential for political blogging, especially for 
local or specialist blogs. As Ferguson summed up: 


... this opens up fantastic potential for MPs interested in improving the quality of 
consultation with constituents (Ferguson, 2004, p. 22). 


and 


... they (the jury) recognised an opportunity for alternative, informal voices to enter into the 
political debate (Ferguson, 2004, p. 24). 
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Methodology 

Bearing in mind the criticisms of MPs’ web sites mentioned in the literature, simple 
criteria for evaluating the blogs were drawn up. The criteria were divided into two 
parts, one being a snapshot analysis and the other the longer term view covering 
currency and activity of the blog. The first part covered the design, content and general 
impressions of the blog: 

* What range of issues does it cover? Obviously local issues are of concern to 
constituents but national issues should be covered too — what does your MP
think about Iraq/tuition fees for example? Can you see easily where your MP
stands on a current major issue (e.g. hunting)? 

* Does it foster two-way interaction and debate or is it all one-way traffic (what
might be termed the blag/blog balance)? Do the responses stick to meaningful
subjects?

* Can you see evidence of personality coming through — your MP as a human 
being, not just part of the party machine? The use of links to other sites also 
suggests personality. Is the content original or does it emanate from party or 
constituency press rooms? 


The second part of the criteria covered the currency and activity levels of the blog over a
three-week period. Sites were monitored on a weekly basis and the following data
were collected:


* date of the most recent blog; 

* number of blogs over each weekly period; 

* number of blogs that received comments; and 
* total number of comments received each week. 
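
The four measures above are simple to derive once each posting is recorded with its date and the number of comments it attracted. The following Python sketch is illustrative only: the `Post` record, the `weekly_metrics` function and the sample figures are hypothetical and were not part of the original study, in which the data were gathered by visiting each site.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Post:
    """One blog entry (hypothetical record layout)."""
    published: date
    comments: int


def weekly_metrics(posts: List[Post], week_start: date, week_end: date) -> dict:
    """Return the four measures for posts dated week_start..week_end inclusive."""
    in_week = [p for p in posts if week_start <= p.published <= week_end]
    # The most recent post may predate the week if the blog has gone quiet.
    seen = [p.published for p in posts if p.published <= week_end]
    latest: Optional[date] = max(seen) if seen else None
    return {
        "most_recent_post": latest,
        "posts": len(in_week),
        "posts_with_comments": sum(1 for p in in_week if p.comments > 0),
        "total_comments": sum(p.comments for p in in_week),
    }


# Invented sample data, for illustration only.
sample = [
    Post(date(2004, 10, 5), 4),
    Post(date(2004, 10, 12), 0),
    Post(date(2004, 10, 14), 3),
]
result = weekly_metrics(sample, date(2004, 10, 8), date(2004, 10, 15))
quiet_week = weekly_metrics(sample, date(2004, 10, 16), date(2004, 10, 22))
print(result)
print(quiet_week)
```

Keeping the date of the most recent posting separate from the weekly counts means that a dormant blog still reports when it was last updated, even in a week with no postings.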


The next step was to draw up a list of MPs with weblogs. The e-savvy Guardian had
already done most of the groundwork here, but there were a few additional candidates
available on www.w4mp.com (the portal for recruiting researchers for MPs) and from the
Economic and Social Research Council at Salford University (see
www.esri.salford.ac.uk/ESRCResearchproject/links.php#blog). This resource proved to be
updated more often, so it was the main source of bloggers. The number of MPs with blogs at the
moment is so small that it was possible to include all of them in the analysis. They are 
(with constituency and party affiliation): 

* Austin Mitchell MP (Grimsby, Lab; www.austinmitchell.org/index.php);

* Clive Soley MP (Ealing, Acton and Shepherds Bush, Lab; http://clivesoleymp.typepad.com/);

* Richard Allan MP (Sheffield Hallam, Lib Dem; www.richardallan.org.uk);

* Shaun Woodward MP (St Helens South, Lab; www.shaunwoodward.com/);

* Tom Watson MP (West Bromwich East, Lab; www.tom-watson.co.uk/);

* Boris Johnson MP (Henley, Con; www.boris-johnson.com);

* Sandra Gidley MP (Romsey, Lib Dem; http://romseyredhead.blogspot.com); and

* Peter Black AM (South Wales West, Lib Dem; http://peterblack.blogspot.com/).








As yet, there are no blogging MSPs.

The first snapshot analysis was conducted on October 15, but by the second, a week 
later, Sandra Gidley had set up a blog. It was decided that this should be included as 
the sample size was so small. 

There are numerous proxy blogs that have been set up for MPs. These are run by 
bloggers on behalf of MPs and usually cover what the MP has been doing, where he/she 
has visited and spoken, etc. However, these seem to be a move in the wrong direction as 
they remove the personal element that is crucial to a blog: it won't be the MP’s personality
shining through or reflected in the choice of links. It’s really more a form of online
stalking. See www.perfect.co.uk/2004/09/boris-and-the-political-weblog-movement for a
list. For this reason, proxy blogs are not included in the scope of this study. 


Results 

Currency and activity 

Tables I-III reveal some interesting results. It is comforting to note that in each week, at
least three-quarters of the blogs had been updated within the last day. Taking Austin
Mitchell’s persistently out-of-date blog out of the equation, these results are promising.
Another blip is the lack of activity on Tom Watson’s blog from October 9 to October
26, but he does apologise for this and cites family reasons. So in terms of
currency, the blogs fare well. In terms of general activity rates, Sandra Gidley and
Peter Black are by far the most prolific bloggers — each having posted up to four times
on some days. Again ignoring Austin Mitchell, our other bloggers have reasonable and
regular levels of activity. What is noticeable is that the most prolific bloggers do not
receive as many comments as those who blog only two or three times a week. For
instance, 100 per cent of Tom Watson’s blogs in our evaluation period receive
comments — as do Clive Soley’s and Boris Johnson’s. Conversely, only about one-third
of Shaun Woodward’s blogs receive feedback. Gidley and Black have sometimes
received less than a 25 per cent response rate and often many of their blogs receive no
comments at all — do constituents want to be updated, but not inundated? Sandra
Gidley’s blog has also only been in existence for a short time so perhaps people are not
aware of its existence yet. Of course, the media exposure of each of our bloggers might
also account for the number of hits their blog receives.


Table I.
Date of evaluation: week October 8-15

Name of MP/AM     Date of most    Number     Number receiving   Number of comments
                  recent blog     of blogs   comments           in total
Tom Watson        October 9       2          2                  18
Clive Soley       October 14      3          3                  10
Richard Allan     October 14      3          3                  18
Austin Mitchell   August 2        N/A        N/A                N/A
Shaun Woodward    October 14      3          1                  2
Boris Johnson     October 15      5          5                  254
Sandra Gidley     October 15      19         3                  7
Peter Black       October 15      13         5                  8


Table II.
Date of evaluation: week October 16-22 inclusive

Name of MP/AM     Date of most    Number     Number receiving   Number of comments
                  recent blog     of blogs   comments           in total
Tom Watson        October 9       0          0                  0
Clive Soley       October 21      1          1                  3
Richard Allan     October 22      7          7                  42
Austin Mitchell   August 2        0          0                  0
Shaun Woodward    October 22      3          1                  2
Boris Johnson     October 22      3          3                  315
Sandra Gidley     October 22      21         7                  30
Peter Black       October 22      14         3                  9


Table III.
Date of evaluation: week October 23-29 inclusive

Name of MP/AM     Date of most    Number     Number receiving   Number of comments
                  recent blog     of blogs   comments           in total
Tom Watson        October 27      2          2                  10
Clive Soley       October 24      2          2                  4
Richard Allan     October 28      6          5                  11
Austin Mitchell   August 2        0          0                  0
Shaun Woodward    October 28      3          0                  0
Boris Johnson     October 28      2          2                  15
Sandra Gidley     October 28      13         8                  10
Peter Black       October 28      16         1                  [illegible]
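
The proportions discussed above can be recomputed from Tables I-III. The sketch below is an illustration rather than part of the original analysis: the figures are transcribed from the three tables (blogs posted, blogs receiving comments, total comments), and the helper name `share_commented` is hypothetical. Austin Mitchell is omitted because he posted nothing in the period, and Peter Black’s comment total for the third week is left as `None` because it is not legible in Table III.

```python
# (blogs posted, blogs receiving comments, total comments) for weeks 1-3,
# transcribed from Tables I-III.
WEEKS = {
    "Tom Watson": [(2, 2, 18), (0, 0, 0), (2, 2, 10)],
    "Clive Soley": [(3, 3, 10), (1, 1, 3), (2, 2, 4)],
    "Richard Allan": [(3, 3, 18), (7, 7, 42), (6, 5, 11)],
    "Shaun Woodward": [(3, 1, 2), (3, 1, 2), (3, 0, 0)],
    "Boris Johnson": [(5, 5, 254), (3, 3, 315), (2, 2, 15)],
    "Sandra Gidley": [(19, 3, 7), (21, 7, 30), (13, 8, 10)],
    "Peter Black": [(13, 5, 8), (14, 3, 9), (16, 1, None)],
}


def share_commented(weeks):
    """Fraction of blogs that drew at least one comment over all weeks."""
    posted = sum(week[0] for week in weeks)
    commented = sum(week[1] for week in weeks)
    return commented / posted if posted else None


for name, weeks in WEEKS.items():
    print(f"{name}: {share_commented(weeks):.0%} of blogs received comments")

# Total comments on Boris Johnson's blog across the three weeks.
johnson_total = sum(week[2] for week in WEEKS["Boris Johnson"])
print("Boris Johnson, total comments:", johnson_total)
```

Pooling the three weeks gives a single figure per blogger; the weekly rates in the tables vary more, which is why Gidley and Black dip below 25 per cent in some weeks but not in others.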

Boris Johnson had had a fraught time during our three-week evaluation period, 
having been forced to go to Liverpool and apologise for comments in The Spectator 
about the Liverpudlian response to grief. It is interesting to see that outrage in the 
press and media in general also manifested itself on his blog. The 584 comments he 
received during the three weeks were largely in response to his postings about
Liverpool and his subsequent apology. None of the other bloggers or blogs has received 
anything like this level of activity. 


Range of issues covered by blogging MPs 

Table IV shows the range of subjects blogged by our representatives. This is an 
important part of the analysis as constituents need to know what their MP thinks 
about local and national issues. You should also be able to see areas of activity that 
reflect the particular interests and concerns of each member. 

Table IV provides a useful analysis, showing the balance of national and local issues 
in our politicians’ blogs. The column labelled “other” is for comments that do not easily 
fall into either category — it includes postings about blogging and more obscure 
subjects. It is not always clear whether a subject (council housing, hospital waiting lists, 
etc.) is local or national, but if no mention of a locality appeared in the blog it was 
classed as national. Roughly two months’ worth of blogs were categorised in this way. 
The analysis is rather crude as some of the blogs were not yet two months old and some 
of the bloggers were so prolific that a reasonable idea of the balance could be obtained 
from a shorter period. For Peter Black in the National Assembly, only issues specific to 
Cardiff or Swansea were considered local, while subjects pertaining to the whole of 
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Wales were classed as national along with issues relating to the wider UK. Topics were 
only listed once, even though they might have been mentioned in several blogs. 
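
The classification rule just described can be written down as a short decision procedure. The sketch below is a hypothetical illustration: the original categorisation was made by reading each posting, and the `OTHER_MARKERS` keywords, the function names `classify_topic` and `unique_topics`, and the example topics are assumptions introduced here, not taken from the study. Only the place names Cardiff and Swansea come from the text.

```python
from typing import Iterable, List

# Assumed, illustrative keywords for postings about blogging itself.
OTHER_MARKERS = ("blogging", "weblog")

# Localities treated as "local" for Peter Black, as stated in the text.
PLACES = ("Cardiff", "Swansea")


def classify_topic(text: str, local_places: Iterable[str]) -> str:
    """Label a topic as 'other', 'local' or 'national'."""
    lowered = text.lower()
    if any(marker in lowered for marker in OTHER_MARKERS):
        return "other"
    # A topic is local only if a locality is explicitly mentioned.
    if any(place.lower() in lowered for place in local_places):
        return "local"
    # No locality mentioned: classed as national.
    return "national"


def unique_topics(topics: Iterable[str]) -> List[str]:
    """List each topic once, keeping first-seen order."""
    return list(dict.fromkeys(topics))


for topic in unique_topics(
    ["Hospital waiting lists in Swansea", "Hospital waiting lists",
     "Why I started blogging", "Hospital waiting lists"]
):
    print(topic, "->", classify_topic(topic, PLACES))
```

A keyword match is far cruder than a reader’s judgement, so a sketch like this could only support, not replace, the manual coding used in the study.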
Several interesting points become evident. First, only Sandra Gidley and Shaun 
Woodward provide a wealth of information on local issues — for these MPs local topics 
receive more coverage than national issues. Conversely, Tom Watson, Clive Soley, 
Richard Allan and Boris Johnson give local issues only fleeting and minimal exposure. If 
the purpose of blogs is to re-engage with constituents, not everyone is going to score well 
on this front. However, on all blogs, topics of major importance (hunting, Iraq, the US 
elections) are mentioned, so that is encouraging. Only two of our sample mention the 
recent party conference season, and overall party events receive little attention. Looking at 
the subjects, one can see areas of particular and personal interest to our MPs — something 
constituents might appreciate. The number of Boris Johnson’s postings in the ephemera
column suggests a more comedic element to his blog than the others. Boris Johnson’s blog
is the only one in our sample where the constituency manager writes some of the copy —
all our other bloggers have kept their blogs entirely personal,
something that was very important to the jury in the Hansard Society study.


Two-way interaction and debate versus one-way traffic 

This really is the crux. Bearing in mind that MPs’ web sites have been roundly 
criticised for being all one-way traffic and lacking signs of real interaction, this is 
where blogs can really make a difference. The questions are: 


* Do people respond to the blogs?

* Do members engage in debate with the respondents?

* Do the responses stick to the issue and make salient points, or do blogs
deteriorate into a general forum for abuse and pointless messages?


What becomes apparent when analysing the blogs and the comments is that, on the 
whole, our blogging MPs do not enter into debates with people who comment on their 
sites. What usually occurs is MPs submit another blog on the same subject that might 
pick up on points made in the comments. Richard Allan uses this technique effectively, 
for example: 


Many thanks to those who commented on my earlier post regarding the scope on interception 
of email under the Regulation of Investigatory Powers Act (RIPA). This raises again a 
potential inconsistency with the Police and Criminal Evidence Act (PACE) powers that came 
out at committee ... (October 26, 2004). 


Clive Soley usually has a sub-section labelled “responses” at the end of one of his blogs. 
He refers to contributors by name and responds to their points in a genial manner. 
What is interesting about Clive’s respondents is that they use the weblog in a dynamic
way and include links in their responses to illustrate their points. Almost all of the 
comments received are cogent and relevant too. 

Conversely, Sandra Gidley has an interesting posting on her blog: 


Made a decision at the weekend that I was not going to comment on the “comments” on the 
blog. I may decide to pick up on points that people make but if anyone has any genuine 
queries I am always happy to be e-mailed at gidleys@parliament.uk. 

The original aim of the blog was to inform, comment from Westminster etc so I shall try 
and get back to that (Tuesday October 26). 











Hardly a statement to inspire those who hoped blogging would revitalise political 
debate and proper two-way interactivity. Other MPs have also revealed their priorities 
behind their blogs: Clive Soley has a whole paper on his blog entitled “Why MPs 
should get blogging”. In this he states that blogging “immediately appealed to me as it 
was an additional way of talking to the public without the media in the middle”. 
Though he acknowledges the interactive nature of his blog, he also implies that he views
it as a top-down mechanism: 


I had 14 comments in a constructive discussion on tuition fees in the run up to the recent 
legislation. It is easier to exchange views on subjects like this than by conventional letter and 
it has the added advantage of allowing the correspondents to see and comment on other 
people’s views [...]. the value I get from it is the very wide audience and my ability to get my 
views and arguments across unhindered. 


On the whole, fellow bloggers have remained positive and pertinent, but on the 
downside there are a couple of instances when responses have been wide of the mark. 
One of Richard Allan’s postings about trains received homophobic mutterings, and one
of Austin Mitchell’s blogs received the request to ensure that Tony Blair remains as 
Prime Minister so we can “continue to take the micky”. Such instances are few and far 
between and the level of activity and nature of responses suggests that people enjoy 
participating in online debates and have useful contributions to make. Following Boris 
Johnson’s indiscretions and dismissal from the shadow front bench, his blog was 
forced to close due to the volume of traffic: 


We are snowed under at the moment, but will be ready to resume later on this week with 
Boris’s next posting. 
All comments options will be frozen in the meantime (November 14, 2004). 


Perhaps the problem this time was that the comments did stick to the point. 


Evidence of personality 

One major concern expressed about MPs’ web sites was the lack of personality shining 
through and the amount of information that appeared to emanate from sterile party
offices. Equally, Ferguson’s work for the Hansard Society on blogs showed that the 
jury appreciated a lively writing style and found some formal blogs to be “boring” 
(Ferguson, 2004, p. 19). It is difficult to measure personality as it is rather an intangible 
quality, but there is evidence of personality in our sample, both in terms of content and 
also reflected in the information people link to. From Shaun Woodward’s blogging it is 
easy to see his specialised subjects — civil partnerships and smacking of children. 
However, most of his blogs are formal and read more like press releases than 
conversations or diary items. The majority of his links simply point to his other blogs
or press releases, so overall it is quite a dry site. For those commentators who think a
blog should be like an informal chat, Woodward’s blog is more like a formal dinner 
party conversation. This formal style might also explain the lack of responses to his 
blogs compared to other MPs. Contrast this to Austin Mitchell’s (out of date) blog 
where his personality and off-message style is all too evident. For example: 


Yippee! Bloggy Days are here again. At last the summer recess of the most exciting session 
we've had since, at least, 2003. Too much blah to blog. Until Tony sent us away to sit on the 
beach at Cleethorpes in the hope that we'll forget about Iraq, WMD, Britain as Robin to 
America’s Batman, University fees, Foundation hospitals, City Academies and other assorted
Tony tactics. “Look I have Come Through”, thinks the Great Helmsman. “Now I can get on
with devising the meteoritic mish mash I want to foist on the party for the upcoming election. 
Divide and rule in education and health, Yobbo-bashing as a social strategy, squeezing social 
security while subsidising fat cats, giving more boozing time but ask people to drink less and 
holding the unions in check so they don’t frighten the incoming investment we're hooked on”. 
It looks so awful Austin will have to ask to be excused on grounds of taste. Or perhaps 
religion. 


Very typical of Austin, but nicely informal and no evidence of the party machine 
interfering. Austin’s content comes from a variety of sources that make interesting and 
different reading — his serious letters to the chairman of the Monetary Policy 
Committee, his amusing diaries produced for the House Magazine (Westminster’s 
weekly rag), his “opinions” (especially on tuition fees) and articles he’s written for the 
local press. As you might expect from someone who changed his name to Austin 
Haddock to help the local fish industry, there are comical elements to his blog but 
serious ones also get coverage. Overall, a good entertaining mix, but sadly vastly out of 
date. 

These represent the two extremes of our bloggers: other MPs sit somewhere in the 
middle. Clive Soley’s site lacks links to other information but doesn’t lack personality. 
Its style is casual (not overly) and chatty and he solicits views on different topics — 
some more heavyweight than others. He freely gives his opinions on topical issues and 
clearly is not afraid to speak his mind: 


Am I alone in thinking that the demonstrations by Fathers4justice show an incredible lack of
responsible judgement? Would you really want a child to be in the care of a man who climbs
onto a ledge of Buckingham Palace or throws powder in the House of Commons at times like
this? (posted on September 15). 


And on the Opposition: 


The Hartlepool election was a very interesting result. I don’t see how the Tories can regain 
support across the political spectrum unless they come up with more voter friendly policies. I 
also don’t think their current stance on Europe is helping them as much as they think 
(October 1). 


Clive also writes whole chapters on issues of particular concern to him and mounts 
these on his blog, lists them on the menu bar and links to them. A good way of 
providing more information than can be contained in a single blog. Overall, his blog 
does read like a discussion as he refers to fellow bloggers by name and responds to 
their postings in an open and amicable style. 

Tom Watson’s blog also follows this model, which some might consider surprising 
bearing in mind Watson’s recent appointment as a junior whip. Lots of personality 
shines through and at times his blog reads like someone thinking out loud or a stream
of consciousness. Richard Allan also adopts a conversational tone that makes easy 
reading. Boris Johnson’s personality is evident in his postings and also his choice of 
links. These cover the publications he writes for, blogging MPs (from all persuasions) 
and PPCs. He also links to blogdex and theyworkforyou — reinforcing the idea of a 
community of bloggers. Jeremy Crampton mentions this community in his work The 
Political Mapping of Cyberspace when he states: 





... blogging is an activity which both takes place in and produces community. Bloggers link 
to each other, comment on each other's site, mention each other in blogs [...] thus creating
friendships and mutual support (Crampton, 2003, p. 96). 


It is also interesting to see that comments about blogs appear regularly in our sample 
and that MPs link to other MPs who blog, despite party differences. Many of our 
sample also link to other political blogging and bloggers’ sites. Those commentators, 
like Crampton, who suggested bloggers form online communities, appear to have been 
vindicated. This is unusual as political alliances are usually formed on the basis of 
party allegiance. 

Sandra Gidley follows suit with a personable style and details on her week — 
including local quiz and swimming sessions. You get a real sense of what she has been 
up to locally and in the House through this blog. She also talks about colleagues in a 
friendly way, congratulating Charles and Sarah on their pregnancy, and mentions that 
her Conservative PPC opponent has a blog. She does link to information within her 
blogs but not a great deal — it reads like a diary really so there is not so much scope to 
link. Boris Johnson is not the author of all the postings on his blog: some come from 
Melissa, his constituency manager, but his style does shine through, as you might 
expect from someone who is so prolific a writer and has appeared on “Have I Got News for You”.

The overall impression of the blogs taken together is that party machineries have 
little or no input or control over them. Although MPs may write broadly along party 
lines, these blogs are clearly the work of individuals and, apart from a cursory link to 
the main party site, party control seems to be lacking. Although the most recent party 
conference received mention on a couple of blogs, links to official party thinking 
contained in press releases or documents are few and far between. A conversational 
and chatty tone is the norm for most of our bloggers. 


Design 

Design is usually one of the key criteria in judging the merits of a web site, and has 
certainly been used for political web sites by various analysts. However, according to 
Rebecca Blood, blogs should have a “simple functionality” (Blood, 2002, p. 4) and very 
little else. Remember, commentators often link blogs to the early days of the net when 
people relied on simple HTML coding and web pages were often just pages of links or 
bookmarks. Bearing this in mind, all the blogs we examined were perfectly functional 
and easy to use. On the whole, the blogs followed a similar pattern — entries in reverse 
chronological order, links to other sites on one side, links to previous blogs on grouped
subjects and sometimes archives of blogs grouped into months. This suggests that 
blogging software does not diversify much. However, a few of our MPs have gone the 
extra mile and added a few designer features. Austin Mitchell’s blog is designed 
externally but his is the only one where you can personalise the home page. This 
allows users to set up an account and choose how they want the front page to look. As 
the design team explains: 


As a registered user you can:
* Post comments with your username
* Send news with your username
* Have a personal box on the homepage
* Select how many news items to show on the homepage
* Customize the comments
* Select different themes
* and lots of other cool stuff...


though what “lots of other cool stuff” covers is anyone’s guess.

Boris Johnson’s blog also has an extra feature lacking on most other blogs: a search facility.
However, judging by the bylines given on most sites, almost all the bloggers have
outside help in constructing and maintaining their blogs. Design is not such a key issue
and everyone's is satisfactory, though Sandra Gidley’s is alarmingly pink with pink
text.


Conclusions 

So, what is our overall impression of weblogs as a medium for MPs engaging their 
constituents? Although the sample size is small and MPs’ weblogs are still in their 
infancy, there are promising signs. In terms of activity, most of our MPs are regular 
bloggers, some adding to their blogs every day. In particular, Sandra Gidley and Peter 
Black are prolific bloggers. With the exception of Austin Mitchell, our entire sample 
maintains regular contact with the blogosphere. In terms of attracting responses, the 
picture is more mixed. Our two most prolific bloggers (Gidley and Black) receive some 
of the lowest response rates; other bloggers have an almost 100 per cent response rate to
their messages. Perhaps this disparity is to do with the age of the blogs, or perhaps
constituents want to be kept informed but not continually updated. The frivolous 
nature of a minority of the blogs might be responsible, but Boris Johnson’s bout of flu 
did receive over 40 comments, mainly about the relative merits of whisky and Lemsip 
— not taken together, hopefully. 

Whether any of our sample engages in real two-way debate is also open to 
interpretation. It appears that some make little or no effort to refer back to messages 
received and Sandra Gidley even admits to preferring e-mail if people have genuine
concerns. Both Clive Soley and Richard Allan are more geared up to posting responses,
if not to individuals then to aggregated postings on the same issue. It seems that weblogs
are currently largely a top-down form of communication, though most of our sample 
appear to be grateful for comments received. In other senses, weblogs have bridged the 
gap between representative and constituents in that they make it easier to see what 
your MP has been doing recently, on the local, national and Parliamentary stage. This 
brings us to the balance of the content. The majority of the blogs we analysed only 
covered national or international issues - Shaun Woodward and Sandra Gidley being 
the exceptions here, though Peter Black does give local issues a reasonable amount of 
coverage. It would be interesting to ascertain what issues constituents would like to see 
covered in weblogs. What is clear across our sample is that party machineries have no 
input to the blogs — individuality and personality shine through almost without 
exception. There are no links to party press releases or initiatives and even the party 
conferences received only limited coverage. This is in stark contrast to MPs’ web sites 
which have become staid and dull — what will be interesting to see is whether, like
MPs’ web sites, the party machines do take over once the band wagon is rolling and
start to impose some rules on content and style. 

Perhaps Tom Watson was right when he claimed that weblogs would be the main 
method of communication for politicians within ten years. Research by the Economic 
and Social Research Council in 2002 indicated that the internet was the preferred 
means of political engagement for young adults: 

30 per cent of young people aged 15-24 say they engage in online political activity, compared
to 10 per cent who had participated in more traditional politics. The Internet also wins out for
those aged 25-34, with 28 per cent participating online, compared to 18 per cent offline (Ward,
2002).


Bearing in mind that only 40 per cent of 18-24 year olds voted in the 2001 General 
Election, any additional means of engaging this age group in the political process 
should be encouraged. 
blogsphere 


Richard Rogers 
Media Studies, University of Amsterdam, Amsterdam, The Netherlands 


Abstract 


Purpose — To suggest methods and approaches to the study of relationships between the blogsphere 
and news, and to show, through a preliminary study, how the blogsphere makes particular political 
contributions through the manner in which social issues are discussed. 


Design/methodology/approach — The article provides a set of research questions as well as 
general methodological approaches to undertake empirical, comparative analysis of the blogsphere 
and news. It reports on a preliminary study of the contribution of the blogsphere to politics using 
semantic analysis. Hyperlink analysis of the right-of-center US political blogsphere is also provided in 
a figure. 

Findings — It was found that the contribution of the blogsphere to political issue formation is distinctive 
from that of the news, for the blogsphere provides to issues a poignancy not found in the news. 

Research limitations/implications — The reported study is suggestive of a particular contribution 
the blogsphere may make to issue formation. 

Practical implications — The article outlines a research agenda. 

Originality/value — The article seeks to reorient the study of the blogsphere. 

Keywords Worldwide web, Search engines, Politics 


Paper type General review 


Introduction: relationships between the internet and news 

When we read that bloggers, in late 2002, focused media attention on dubious remarks 
uttered by Trent Lott, ultimately prompting the then Senate Republican leader to resign 
his leadership post, or when we read that bloggers dealt a decisive blow to the credibility 
of CBS News by exposing as fake the memo that purported to show George W. Bush’s duty-shirking, 
ushering in a retirement and firings at the established broadcasting company, 
questions arise about the distinctive contribution blogs may make to news. Is the 
contribution made by blogs to media, as opposed to that made by other “spheres” of the 
internet, peculiar? Here it is argued, in the opening sections, that blogs reinstate and 
perhaps extend the reach of the informal of the internet, also making it more serious. 
After providing means and questions for the study of the blogsphere, and in particular 
mini-blogspheres, the author concludes with a finding from a small case study 
concerning blogs’ contribution to the debate surrounding the FCC’s proposal to relax 
media concentration rules. It was found that an issue-oriented, political 
mini-blogsphere offered a particular poignancy to the issues, distinctive from the 
news sphere. (For the third “sphere” — the web sphere — see Foot et al., 2003; Schneider 
and Foot, 2005.) 

Prior to blogs, the relationship between news and the internet was discussed, 
mainly, in two senses. First, the internet was informal, both in its use and its contents. 
The informal made the internet into a special “real”. For example, it would put on view 
a picture of a scientist's domestic pet, beneath the more well-known list of publications, 
on his or her “homepage”, with bits and pieces of code and graphics picked up from the 
web (the visitor counter, the animated gif). Showing the unpolished, amateurish and 
some of life’s backstage, the particular “real” on offer on the internet became well 
suited for “dirt-diggers”, not only in the sense of the Drudge Report (where dirt would 
be sent in and Drudge might report it), but also in terms of “native content”, where, for 
example, a search engine query ultimately led to the newspaper headline, “UN 
weapons inspector is leader of S&M sex ring” (Rennie, 2002). With the internet, 
importantly, we witnessed at the same time the circulation of the informal. “Stories 
circulating on the internet” became a well-known expression, but the significance of the 
expression lies not so much in its connotation of the web’s lack of credibility, however 
important it may have been for journalists’ leaving the dirt well alone, but more in the 
idea that many have heard, and more soon will know. Perhaps it is the reach of the 
informal towards the more formal that the net strengthens. Whilst not necessarily 
treated in the serious press as worthy of reporting, the internet stories, so-called, 
nevertheless could be circulated further even by the seriously minded to colleagues, 
friends and family with smilies, in the day when e-mail was seen as a means for relaxed 
communication as opposed to official or well-formatted letter-writing. (Relaxed 
communication has moved to chat software.) Bloggers could be said to maintain, or 
reinstall, the informal medium, which arguably was in decline owing to “new media 
concentration” issues, such as the idea that web usage has become more habitual (more 
regular site visitation, less “surfing”), the marketing reports that fewer and fewer sites 
receive greater majorities of all hits and such like. The blogsphere could be said to 
bring back, and lengthen the reach of the informal, and also make it more serious. One 
recent example is a case in point. That blogging employees at Waterstone’s Book store 
(UK) and at Google (US) would be fired for their informalities expressed on the internet 
provides at least one indication that it remains a realm for “dirt-diggers”, however 
much more serious that dirt is perceived to be taken, in this case by book buyers and 
search engine users. Journalists also have been asked to stop blogging. 

The second sense in which the internet was discussed with respect to news was the 
idea that it put more time pressure on journalists, shortening the time and potentially 
lessening the amount of care that could be taken by news people, especially at the 
dailies, and weeklies. Whilst some could take solace in the expression that internet 
stories remained “too fresh to be true”, others watched news “catch up” to the internet, 
as the normal 24-hour collection activities of the news (again the back-stage) were put, 
in part, on the front end. Feeds, once reserved for newsroom eyes, were pulled in to web 
sites and displayed. News became fresher on the internet, as well as more readily 
available and searchable. Feeds themselves became transformed. Where once only the 
single source was available at a site — the press agency news ticker or story list from 
one organization — multiple-source sites (aggregators) and news reader software rose 
in use. By virtue of RSS (rich site summary or really simple syndication), bloggers and 
others fetch into their own “Daily Me” RSS readers not just the press agencies but 
newspapers and other news outlets (broadly defined or indeed redefined). Accessed, 
also, were news source aggregators such as Google News and Yahoo News, where 
“news” changed. Google and Yahoo redefined news outlets as also primary sources, as 
opposed to secondary sources only. Whitehouse.gov press releases, for example, are 
news to Google News. Through the aggregators, news, like blogs, arguably has become 
a “sphere” on the internet, in the sense that news is accessed by separate devices. Thus, 
news has adjusted to the internet not so much in the sense of the shortening of editorial 
decision-making time about whether a story is too fresh to be true (however important 
such an observation), but rather in its delivery formats. News also redefined itself with 
the internet: traditional and non-traditional news are placed together by aggregators 
that allow for news search. As with the rise of the “day trader” in the 1990s (more 
likely, the “freetime trader”), information tools have enabled the “freetime newspeople”. 
What is being reported is not gleaned from the place where news happens (both in an 
eyewitness as well as in a Lippmann sense of the official gateways of what has 
happened), but from the internet. 

With regard to the idea of bloggers as “freetime newspeople”, the blogsphere is not 
an amateur-only space, or something between lay and expert. It is all at once, or 
appears to be. It also has its own hierarchy, one source of which is reported in Table I. 
Of the top dozen blogs, according to technorati, five are “political”: two are by “elusive 
bloggers” (atrios and kos), fitting certain ideas about who bloggers are, whilst Andrew 
Sullivan is a former New Republic editor and New York Times writer, instapundit is 
a law professor at the University of Tennessee, and common dreams is a progressive 
NGO, filtering the news. 


Table I. Top 12 blogs on February 19, 2005 

Top 12 Blogs                              Links    Sources 
 1. Boing Boing                          17,810     11,277 
 2. Instapundit.com *                    14,165      9,205 
 3. Buzznet.com                          97,049      7,485 
 4. Deviantart                           10,406      7,438 
 5. Davenetics                            7,546      7,389 
 6. Gizmodo                               9,459      7,204 
 7. Penny Arcade                          7,873      6,844 
 8. Daily Kos *                           9,869      6,825 
 9. eBaum’s World                         9,290      6,347 
10. Eschaton (atrios) *                   7,862      5,600 
11. AndrewSullivan.com - Daily Dish *     7,257      5,450 
12. Common Dreams *                       9,534      5,385 

Notes: Political blogs are marked with an asterisk; the most authoritative blogs, ranked by the number of 
sources that link to each blog; last updated 2.03 am Pacific Time 
Source: Technorati.com 


Blog studies 

Studies concerning blogs have defined the term and categorised genres of blogs (as 
“filters”, “personal journals” and “notebooks”, along the lines put forward by Rebecca 
Blood and others). They have discussed the extent to which they are a special breed of 
journalism and/or truly new media (like the homepage, purportedly). Writers have 
provided a number of case studies of particularly significant impacts bloggers have 
had on mainstream news or “elite media” (as in the Introduction above), and raised 
ideas about the part played by blogs — either “A-list” ones, or more readily as 
interlinked and intertextual “spheres” — in news, information provision, or more 
broadly the information society (see Nieman Reports, 2003; on individual bloggers see 
Drezner and Farrell, 2004). Since software (as at technorati.com) has made blogs into its 
own searchable sphere that also recommends information, ideas have arisen that what 
is happening there is distinctive, and also that it is a “space apart”. For example, the list 
of top books at technorati or at allconsuming.net, gleaned from references in blogs, is 
distinctive from bestseller lists, old media “critics’ picks”, or the list made by 
employees at a bookstore. Thus the “blogsphere” collectively becomes a new source — 
one, additionally, with a distinctive new media-style “ranking” method that leads to the 
recommendations (Rogers, 2004, pp. 1-33). 

In any case, as one set of authors has put it, the “predominant view of blogs [sees 
them] as news filters, and bloggers as highly interconnected”, though their study 
made the counter-intuitive finding (perhaps) that most are journals (diaries) without 
many links (or comments) (Herring et al., 2004). (That there is also a “dark 
blogsphere”, many orphan blogs without inlinks and also without comments, would 
fit with the findings made in the late 1990s and early 2000s with respect to the web 
and its percentage of pages outside of the reach of search engine crawlers — often 
dubbed the “dark web”.) Arguably, however, it is the interconnectivity and 
intertextuality in not the entire blogsphere, but in mini-spheres, that guides the 
“predominant view”. This piece aims to contribute to the critical inquiry of the 
blogsphere, generally, by posing a series of research questions, as well as by 
reporting findings on a small case study. It asks to what extent do blogs, or certain 
mini-blogspheres, constitute a space apart? Indeed, we are interested in broader 
understandings of its significance. What is it for? For example, is it a conversation 
space unto itself, and as such is perhaps suited to provide a distinctive measure of the 
significance of news articles, as it does with books, too? Is it in (successful) 
competition with the mainstream media, in the sense of being distinctive in substance 
(beyond its already aggregated information recommendation culture)? Does it 
undertake some form of “public journalism” however much that may not be the right 
term? To begin, Rebecca Blood observes: 


In my view, the journalism establishment isn’t paying enough attention to the weblog 
universe. [...] Bloggers say what they think, giving reporters a window into the views of 
those outside the media. Bloggers often find angles that professional reporters have missed, 
or ask questions reporters have neglected to ask. And bloggers do amazing research. 
Professional journalists, often working under extreme time pressure, may not have time to 
research a piece as thoroughly as they would like. Bloggers have no externally imposed 
deadlines, and no mandate to research equally the claims of both sides. [...] When bloggers 
link to conflicting or contextualizing material, smart reporters will further research and verify 
promising leads, and credit the bloggers who uncovered them (Blood, 2004). 


Most importantly, we are interested in the interaction between the news sphere and the 
blogsphere, and the broader place assumed by the blogsphere in media. There is a 
series of questions, with brief methodological considerations: 


* Dependency questions — Do blogs depend on news? The question is the extent to 
which blogs are parasitic on, or pose an alternative to, commercial media 
coverage of events. This may be ascertained by the amount and type of 
references to news sources, also with regard to which collections of events, 
issues and other matters they cover, and which they leave aside. One would query the 
blogsphere for newspaper article references relative to other references made 
(e.g. through link harvesting) (Halavais, 2003). Is the blogsphere referring 
primarily to itself? 


* Shared source questions — News sphere research may find that there are fewer 
primary news sources than one may expect, with the remaining news being a 
matter of copying and pasting of text, as well as images and video (see Figure 1). 
We are interested in whether this holds for the blogsphere. Do blogs talk 
primarily about few sources? Is their source range distinct from that of the news? 


* What kind of political contribution is made by the blogsphere? — We would like to 
be able to derive insights that allow us to characterize how the blogsphere’s 
contribution to the political realm may be analyzed. Is it more of a literary space, 
a news space, or a political space? A subset of political weblogs would be studied, 
relative to the overall blogsphere, with the goal of understanding the 
blogsphere’s relative amount and distinctiveness of political content. 
Distinctiveness could be measured against the news sphere (and/or the web 
sphere). One also may study mini-blogspheres that deal with particular issues 
either as a matter of routine or in occasional postings, and ascertain how the 
mini-blogsphere frames or “does” the issues in comparison to the news. A 
short case study on the blogsphere’s contribution to the FCC media concentration 
debate is discussed below. 


* Blog dynamics relative to the news — One may monitor the reaction of the 
blogsphere to a news event. The analysis concerns a comparison of attention 
cycles. This is a memory metric, if you will, but it also may be described in terms 
of the blogsphere’s commitment to certain themes or issues. Do blogs have longer 
attention spans and greater “memory”? In asking that question where, for 
example, the “Bush bulge” is concerned, how long does it take for the blogsphere 
to “give up” on the story, relative to when, say, The New York Times did 
(Lindorff, 2005)? It should be noted here that there is a peculiarity to the 
blogsphere that aids in undertaking longitudinal analysis of the type mentioned 
above. Unless they go offline, blogs retain an archive of past postings as a matter 
of routine, “built into” the blog software. Additionally, blogs tend to have 
separate URLs for each posting (the permalink), which allows one to locate a 
mini-blogsphere from the past — something otherwise not possible on the web, as 
pages and sites are “refreshed”. 


* There is the larger question of who produces the news and the events — Is it the 
traditional media (and their traditional feeders) or the bloggers who make news 
and events? Could the blogsphere provide a measure of what could be safely 
ignored? (Certain blog engines, e.g. Waypath, monitor which news stories are 
cited most frequently on blogs, whereby the blogsphere becomes a collective 
news filter.) When a politician makes a speech, it may be a news event. Is it a blog 
event? Does the blog space have “events” to recommend, or even its own world of 
events, so to speak? What is the quality of this world? 
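
The dependency question above (how far blogs refer to news sources, and how far the blogsphere refers primarily to itself) can be illustrated with a small sketch. It assumes the links have already been harvested from a set of blog postings; the domain lists, the sample URLs and the function names are hypothetical and stand in for lists a researcher would compile by hand. It is an illustration of the kind of measure described, not the procedure used in any of the studies cited.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical, hand-compiled domain lists (assumptions for illustration only).
NEWS_DOMAINS = {"nytimes.com", "news.bbc.co.uk", "cbsnews.com"}
BLOG_DOMAINS = {"instapundit.com", "dailykos.com", "boingboing.net"}


def host_of(url):
    """Return the lower-cased host of a URL, without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def classify(url):
    """Label a link as pointing to 'news', 'blog' or 'other'."""
    host = host_of(url)
    if host in NEWS_DOMAINS:
        return "news"
    if host in BLOG_DOMAINS:
        return "blog"
    return "other"


def reference_shares(links):
    """Share of harvested links that point to news, to blogs and elsewhere."""
    counts = Counter(classify(u) for u in links)
    total = sum(counts.values())
    if total == 0:
        return {"news": 0.0, "blog": 0.0, "other": 0.0}
    return {k: counts.get(k, 0) / total for k in ("news", "blog", "other")}


# Made-up sample of harvested links.
harvested = [
    "http://www.nytimes.com/2004/06/01/story.html",
    "http://www.instapundit.com/archives/001.php",
    "http://www.dailykos.com/story/2004/6/2",
    "http://example.org/report.pdf",
]
shares = reference_shares(harvested)
print(shares)
```

A high "news" share would point towards dependency on commercial media coverage, while a high "blog" share would suggest that the blogsphere is referring primarily to itself. How the domain lists are drawn up is itself a research decision, since aggregators blur the line between news outlets and other sources.
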


Issue poignancy in the blogsphere 

Figure 1.

Co-occurrence of keywords across sources, with size of nodes indicating frequency of
keywords. Data by Google News. Analysis and depiction by Réseau-Lu by aguidel.com.
Research by the Govcom.org Foundation, Amsterdam, August 2003

The work described here was conducted at the Govcom.org Foundation workshop,
“Making Issues into Rights?”, held in Amsterdam, June 21-24, 2004. A small study was
undertaken. It concerned the contribution of the blogsphere to a classic political
undertaking: requests for comments by the Federal Communications Commission
(FCC) on proposed policies in the media concentration arena. Does the blogsphere
“participate”, which blogsphere participates, and how? One elementary means of
demarcating the blogsphere’s contribution (that is, which blogsphere is engaging in
the issue surrounding the FCC hearings on “localism”) is to search Technorati or other
blog engines, or even a popular search engine, such as Altavista (with the date range
January 1, 2003 to July 19, 2004), for Blog AND FCC AND localism, where there are 144
returns. These 144 blogs (approximately) would be the candidates (or overall
population) from which a mini-blogsphere for a particular issue may be sought. One
subsequently analyzes the extent to which these 144 pages are densely interlinked,
referring to the same sources (e.g. news or other blogs), as well as keywords or phrases.
To do so, one “scrapes” (or copies) all the returns. One may crawl each of the pages (the
specific postings or permalinks), and enquire into interlinking between them. One also
can scrape all the pages, and query them for keywords or phrases. This general
technique would provide some understanding about whether the “predominant view”
(described above) holds of blogs being highly interconnected, and also intertextually
related, in the sense of a conversation with similar (frames of) references (see also
Figure 2).
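
The scraping and interlinking steps described above can be sketched roughly as follows.
This is a minimal illustration in Python using only the standard library, not the tooling
used at the workshop. The page contents, the example URLs and the helper names
(`interlinks`, `interlink_density`, `keyword_counts`) are assumptions made for the
sketch; the three pages merely stand in for the 144 scraped returns.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collect absolute href targets from one scraped page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop any #fragment part.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.add(absolute)


def interlinks(pages):
    """Return the set of (source, target) links that stay inside the sample."""
    edges = set()
    for url, html in pages.items():
        parser = LinkExtractor(url)
        parser.feed(html)
        for target in parser.links:
            if target in pages and target != url:
                edges.add((url, target))
    return edges


def interlink_density(pages):
    """Share of all possible directed links between sampled pages that exist."""
    n = len(pages)
    if n < 2:
        return 0.0
    return len(interlinks(pages)) / (n * (n - 1))


def keyword_counts(pages, keywords):
    """Count, per keyword, how many pages mention it (case-insensitive)."""
    return {
        kw: sum(1 for html in pages.values() if kw.lower() in html.lower())
        for kw in keywords
    }


# Hypothetical sample: three permalinks standing in for the scraped returns.
pages = {
    "http://blog-a.example/post1": '<a href="http://blog-b.example/post2">b</a> FCC localism',
    "http://blog-b.example/post2": '<a href="http://blog-a.example/post1">a</a> FCC hearings',
    "http://blog-c.example/post3": "no links here, only localism",
}
```

A density close to 1 would support the view of a highly interconnected mini-blogsphere,
whereas a value close to 0 would suggest that the returns share a topic without being in
conversation with one another. The keyword counts give a first, crude indication of shared
frames of reference.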

I would like to put the small case study of the (political) blogsphere and its
contribution to issue politics (as well as social change) into some perspective with the 
findings that were made, in a parallel project, with respect to another important set of 
actors — non-governmental organizations, also contributing to the FCC debate on 
media concentration. The point to be made concerns the larger context of expectations 
made by actors — bloggers as well as NGOs — of media coverage. Analyzing the 
database prepared by the International Center for Media Action (ICMA) of the 
“hundreds of groups that took action to stop FCC deregulation of media ownership” 
provided us with the opportunity to understand how NGOs spend their time and 
money when campaigning for change — in this case the FCC’s proposed relaxation of 
laws restricting media concentration, for example, allowing one corporation to own not 
35 per cent, but 45 per cent of the national television market, among other proposals[1]. 
The procedure requires a participation component (public hearings and requests for 
comments). Of those hundreds of groups campaigning against media concentration, we 
found that the ones with the greatest amount of resources concentrated their activities 
on events and working the press, whereas those with the least amount of resources 
laboured on comments. Whilst only the smallest indication of the significance accorded 
to “networking” at events and penetrating the press by issue-oriented actors (and 
however unfair this portrayal also might be), it was nevertheless a sobering finding. 
Informal democracy, so to speak, was given more weight than formal democracy (see 
Figure 3). Thus to be critical of the blogsphere’s dependency on news and its desire to 
shape it could detract attention from the larger assumptions made by politically 
oriented actors about the significance of getting press. The US government shares 
similar assumptions, however much the tactics may be different. The New York Times 
reported the recent warning issued by the US comptroller general, after a series of 
cases where federal agencies were working the press, circulating unattributed, “cans” 
as news: 


In fact, it has become increasingly common for federal agencies to adopt the public relations 
tactic of producing “video news releases” that look indistinguishable from authentic 
newscasts and, as ready-made and cost-free reports, are sometimes picked up by local news 
programs. It is illegal for the government to produce or distribute such publicity material 
domestically without disclosing its own role (Kornblut, 2005). 
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Figure 4. 

Mini blogsphere clustering around FCC and public interest, hearings, localism, diversity and
ownership, where size of nodes indicates frequency of sources mentioning the keywords, and
placement of nodes according to centrality

Figure 5.

Mini news sphere clustering around FCC and public interest, hearings, localism, diversity and
ownership, where size of nodes indicates frequency of sources mentioning the keywords, and
placement of nodes according to centrality. Data by googlenews.com, scraped by Govcom.org,
and graph by Réseau-Lu by aguidel.com

Figure 6.

Semantic analysis of mini news sphere around FCC and public interest, hearings, localism,
diversity and ownership, where size of nodes indicates frequency of sources mentioning the
keywords, and placement of nodes according to centrality. Data by googlenews.com, scraped
by Govcom.org, and graph by Réseau-Lu by aguidel.com

To begin to gain an impression of the distinctiveness of the blogsphere’s contribution 
to the FCC media concentration issue, we undertook comparative research of the 
substance of the news and that of the blogsphere. The researchers chose the term FCC
and coupled it with diversity, concentration, localism, hearings and ownership, and
queried engines, looking into the quantity of sources and mentions per term, and also
the extent to which the sources concentrated themselves on one or more terms. For
news, Google News was queried; for blogs, Blogpulse. We found a relatively small
quantity of blogs contributing content to the issues, and a much larger quantity of 
press (see Figures 4 and 5). Focusing on the FCC and public interest, the news, it was 
found, in a textual analysis, concerned itself with a set of terms different from that of 
the blogsphere. Whilst the news contained many procedural terms, the blogsphere 
appeared to “bring the issue home” by connecting it to Howard Stern and Oprah 
Winfrey, two prominent and popular show hosts, who were the source of indecency 
complaints made to the FCC, and potentially faced being “silenced” (see Figures 6 and 
7). Here one could argue that the blogsphere’s contribution to politics lies in granting 
the issue a poignancy less present in the news. 
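
The comparison of sources and mentions per term can be illustrated with a short sketch. It
assumes that the engine results have already been reduced to (source, term, mentions)
records; the record layout, the `summarise` helper and the sample values are hypothetical
and do not reproduce the workshop data.

```python
from collections import defaultdict


def summarise(records):
    """Summarise (source, term, mentions) records returned by one engine."""
    sources_per_term = defaultdict(set)
    mentions_per_term = defaultdict(int)
    terms_per_source = defaultdict(set)
    for source, term, mentions in records:
        if mentions <= 0:
            continue  # a source that never mentions the term does not count
        sources_per_term[term].add(source)
        mentions_per_term[term] += mentions
        terms_per_source[source].add(term)
    return {
        # How many distinct sources mention each term.
        "sources_per_term": {t: len(s) for t, s in sources_per_term.items()},
        # Total mentions of each term across all sources.
        "mentions_per_term": dict(mentions_per_term),
        # On how many of the terms each source concentrates.
        "terms_per_source": {s: len(t) for s, t in terms_per_source.items()},
    }


# Hypothetical records for the two spheres.
news = [
    ("paper-1", "localism", 3),
    ("paper-1", "ownership", 5),
    ("paper-2", "ownership", 2),
    ("paper-3", "hearings", 1),
]
blogs = [
    ("blog-1", "localism", 4),
    ("blog-1", "diversity", 1),
]

news_summary = summarise(news)
blog_summary = summarise(blogs)
```

Comparing `sources_per_term` for the two spheres gives the relative quantity of sources,
while `terms_per_source` shows whether a source concentrates on one term or spreads across
several.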

The author would like to thank the participants of the Govcom.org workshop, “The 
Life of Issues 9: Making Issues into Rights?”, held in June 2004. Special mention is
made of Andrei Mogoutov, Jodi Dean, Zachary Devereaux, Seeta Peña Gangadharan,
Catherine Borgman-Arboleda, Philip Napoli, Gerri Spilke, Erik Borra and Koen 
Martens, who worked on the blogs projects as methodologists, researchers or 
programmers. Thanks also to the workshop co-organizer, Noortje Marres (University 
of Amsterdam), our host, Erik Kluitenberg (De Balie Center for Politics and Culture, 
Amsterdam) and Becky Lentz (Ford Foundation, New York), who made the event 
possible. 


Note 


1. The International Center for Media Action’s database project resulted in the Media Policy 
Action Directory (2004). See also FCC (2003).

Accepted 24 February 2005

Purpose — To confirm that the purpose of the FESD project has been to provide a framework
contract for the whole public sector covering the purchase of an EDM system, technical and 
organisational consulting for implementation and organisational change. 
Design/methodology/approach — The project took the approach of working closely together with 
11 partnering organisations on developing the functional requirements for the system and 
participating in the tender negotiations with the bidding consortia. This has proved valuable, since the 
project has gained a profound legitimacy for its demands and a strong basis for the roll-out in the rest 
of the public sector. 

Findings — The results of the project are manifold: for the first time in the Danish public sector a 
mutual framework contract has made it possible to put the same requirements forward to the bidding 
vendors. It has made it possible to develop mutual technical standards and to develop standardised 
work processes supported by the systems. Furthermore, a number of long-term findings will become
evident over the next two years when the implementation projects begin to show results.

Practical implications — Originally it was one of the major tasks of the FESD project to show 
efficiency gains and return on investment within the project's life span. This has not been possible due
to the fact that the implementation projects in the partnering organisations are far from finished. Also, 
efficiency gains are not always part of the success criteria and it may turn out that efficiency gains 
weigh more in the minds of planners than in the real implementation projects. 

Originality/value — The article is a report from a country highly esteemed for its efforts in pushing 
public digital administration in order to create better service and higher efficiency. 

Keywords Document management, Electronic document delivery, Public sector organizations, 
Organizational change 

Paper type Case study 


What is the purpose of the project? 

Over the years many public organisations have completed more or less successful 
implementations of electronic document and case management systems (EDM 
systems) in whole or parts of the administration. Often the projects did not live up to 
expectations. The cause of the problems originated mainly from the fact that the 
projects focused more on the systems and less on organisational change and 
simplifying of work processes as the main objectives and firm project management as 
a means to reach the objectives. Effective project management and change of work 
processes with a clear aim to obtain efficiency are therefore some of the pillars carrying 
the joint public EDM project called “FESD”[1].


From an archival point of view, the FESD project is an invaluable chance to ensure
that, when the results of the full digital administration penetrate the whole of the public
sector, the effect will not solely be simpler work routines and better services to the
public, but also that documentation practices and legislative demands are met, made
easier and the whole life cycle for electronic archives from birth through to archiving
and preservation is simplified.

Aslib Proceedings: New Information Perspectives. Emerald Group Publishing Limited.
DOI 10.1108/00012530510612095


The background 

Project eGovernment[2] has been initiated by central government and regional and
local administrations in order to promote and coordinate the transition to e-government 
in the public sector. The project is led by a joint board made up of the permanent 
secretaries from five ministries, the managing directors of Local Government Denmark 
and the Association of County Councils in Denmark, and a representative from the 
municipalities of Copenhagen and Frederiksberg. The board is served by the 
IT-Technical Centre in the Ministry of Science, Technology and Innovation and the 
Digital Taskforce, and a secretariat consisting of about 20 employees from all parts of 
the public sector, which is based in the Ministry of Finance. 

In 2002 the Board wished to promote the use of EDM systems in the public sector, 
and asked the Agency for Governmental Management to complete a thorough 
pre-analysis of the problems attached to the implementation of EDM systems in the 
public sector. Based on the results, the FESD project was established in August 2002. 

The starting point for the project was that the whole public sector joined forces to 
create the possible grounds for a mutual EDM solution. This was done partly by 
partnering 11 public organisations, and partly by establishing a framework for a 
contract, which after a period was opened for all public organisations to use. 

The 11 public organisations are a mixture of local, regional and central government 
organisations[3]. All in all, the 11 organisations represent approximately 4,500 
workplaces, of which 3,200 will be covered by the first implementation wave, while the
rest will follow in later implementations. The number of workplaces will be much
larger when institutions underlying the ministries and county councils get involved in 
the project. 

The project is estimated to last two and a half years, with an expected closure date 
in March 2005. The project is staffed by two full-time employees from Local 
Governments Denmark, one full-time employee from the Agency for Governmental 
Management, one half-time employee from the Digital Taskforce (originally from the
Ministry of Economic and Business Affairs), and two full-time employees hired directly 
for the project. Finally, the project has a full-time project manager from The Digital 
Taskforce (originally from the Ministry of Finance). Furthermore, another sub-project 
under the FESD project has been established in the Ministry of Science, Technology 
and Innovation to develop technical standards and data models, which are a 
fundamental part of the FESD project. This project is staffed by four employees, one of 
whom is the project manager. 

In order to get the best possible support and completion for the project, it is being 
followed closely by a steering committee consisting of a director from Local 
Governments Denmark, the director of the Digital Taskforce, representatives from the 
Association of County Councils in Denmark, National Procurement Ltd, the Ministry of 
Science, Technology and Innovation and the Agency for Governmental Management. 
Furthermore, the project is being followed by an external group with approximately 35 
public organisations from all over Denmark. 

After the expected closure date in March 2005, a new FESD project has been 
established within the Agency for Governmental Management. The aim of the
follow-up project is to maintain and further develop the framework contract and ensure
that the public sector invests to the largest possible degree in a FESD solution when 
they buy new EDM solutions or upgrade their existing ones. 


The project in general 

The project supports the 11 partner organisations in implementing the system and
restructuring their work processes, with the tender for the system as its focal point. The
tender contains requirements for a new system but also, to the same degree,
requirements for methods and experience with organisational implementation. It is the
intention that the experience gained will be available for all public authorities to use
after the project has ended. Guidance with best practices on implementation methods,
how to get started, etc., will be published on the project’s web site (see
www.e.gov.dk/fesd) at the end of the project.

The overall aim of the FESD project is that more public organisations will become
digital and as a result profit both in terms of higher efficiency and better quality. On
the one hand, the internal aim is easier, simpler and more efficient work routines and
better cooperation between employees, work processes and IT systems. On the other hand,
there is an external aim to establish an administrative digital foundation in cooperation with
citizens, businesses and other authorities. This external aim involves a strong focus on
standardisation of communication and exchange of information, cases and documents.

Many public organisations want to head down this road but seem to hesitate 
because of less fortunate experiences in the past. The systems on the market have not
always lived up to demands and expectations, and the restructuring of work processes 
has been shown to be difficult. The biggest challenge for public organisations will 
therefore be to rearrange their daily routines and work processes to proceed on a fully
digital basis. The FESD project will, in addition to standardisation, have a strong focus 
on organisational change to an efficient digital administration. 


Visions and aims: approach 
The vision for the electronic case and document management solution is: 


* That an organisation in the public sector will be in need of only one solution for
registration, management, control and archiving of all cases and documents. The
solution should provide a complete overview and control of the documents
within a given case or relating to a legal body. The intention is that any relevant
employee without intensive training can perform registration, management,
control and archiving tasks for all the documents relevant for the consideration
of a case.

* That the solution can be implemented in such a way that the organisation can
harvest measurable and concrete efficiency gains higher than the initial
investment.

* That the solution can be integrated with the relevant specialised in-house
systems, which are used by the organisation through the use of recognised and
system independent standards.

* That the solution can be implemented quickly and effectively without extensive
use of external consultants.

* That the solution can manage standards as declared from the Ministry of
Science, Technology and Innovation regarding, for example, XML and digital
signatures.


In continuation of the strategy for Project eGovernment, the overall scope is to provide 
an efficient penetration and use of electronic case and document management in a way 
that carries the largest possible benefit for the single organisation and the public sector 
as a whole, and minimises the collected public expense. The FESD project must 
therefore reduce the risk of implementing electronic case and document management 
for the single organisation by offering a higher security for a successful 
implementation through common models and standards. 


The tender 
Following the rules for EU tenders, the FESD project had an initial prequalification run 
in March 2003 resulting in 17 requests for participation. From those 17 requests, eight 
vendors were chosen to bid in the subsequent project competition. One consortium 
withdrew their participation after reading the competition material, leaving seven 
remaining. The project competition resulted in three winners. The winners were 
announced shortly after the summer holidays of 2003. After the competition a new 
phase was initiated when the project entered into tender negotiations with the three 
winners. All through the autumn of 2003 the negotiations went on, with four rounds of
negotiations, each beginning with a more refined offer from the consortia. Finally, on 
January 27, 2004 the framework contracts with the three consortia were signed. 
Through the framework contract all public organisations in Denmark are able to 
enter into a contract with one of the consortia behind the agreement for the delivery of 
an EDM solution and technical and organisational consulting on the terms and prices 
negotiated in the framework contract. The three consortia are a mixture of large
international companies and smaller Danish companies. They are: 


* Accenture with Traen Information Systems and Rambøll Informatics as 
subcontractors. The system offered is Acadre, developed by Traen Information 
Systems. 


* CSC with Scanjour and Rambøll Management as subcontractors. The system
offered is Scanjour Captia. 


* Software Innovation with Rambøll Management as subcontractors. The system 
offered is Software Innovation’s own system Public 360. 


The contracts also include special deliveries to the organisations participating in the
FESD project, which the project needs in order to establish the lessons learned and best
practices to publish at the end of the project. Furthermore, the contracts include
optional possibilities to buy services, such as running of the daily operations, further
development, support, education and extra modules for specific work areas.


The partnering organisations 

The 11 partnering organisations constitute the first wave of public organisations using
the framework contracts. They are all well into their projects. They have prepared or
are preparing their own organisations involved in the project and will have completed
the following when their implementation projects begin:

* the project documentation (a prospect, a risk analysis, an analysis of interested
parties, etc.);

* education in the project management method Prince2;

* education of local process consultants and started work process analyses
focusing on the processes relevant to the EDM project;

* used structured tools in estimating the potential and project costs;

* developed best practices on certain areas in cooperation with the other
partnering organisations and the FESD project; and

* established a network with the other partnering organisations.


The organisations have each developed their own visions and goals for their 
implementation. Even though there are differences between the success criteria in each 
of the projects there are a number of similarities. The success criteria mutual to them 
all are: 


* efficiency — efficiency and more effective work processes; 
* management — better management information; 
* technology — that the project contributes to an up to date infrastructure; 


* openness — that the project constitutes a digital ground for openness and 
transparency in administration for citizens; 


* employees — that the project provides a more attractive workplace where 
knowledge sharing is supported by technology; and 


* change — that the project is a driver for change and rethinking within the 
organisation. 


The organisations see a profit in the close cooperation and sharing of experiences with 
others in the same situation, for example, in working together on best practices. 


The archival aspects of the FESD project 

The above descriptions have emphasised the organisational changes that the FESD 
project works strongly towards. One of the less frequently mentioned consequences of 
a digital administration is often a restructuring or a closure of the administrative 
process involved in the registration practice, which in Denmark usually takes place as 
one central function within the organisation. Seen from a documentation perspective 
this work process has been essential in securing the quality of registered metadata — 
the critical success factor for the later search processes. It is very rare these days to see 
an EDM project where the search for efficiency gains does not involve a restructuring 
of the central registration, moving the process out to the academic staff where they 
each have to register documents, cases, e-mails, etc. In other words, the former work 
process, comprising highly trained professionals, is now stretched thinly throughout 
the whole of the organisation. It is a well-known fact that the registration process is not 
being prioritised by the rest of the staff, causing a fall in the quality of registered 
metadata and thereby in the whole of the documentation as such. The registration 
process simply does not have the proper attention as the primary part in the creation of 
a knowledge-based organisation.

These changes in priorities are documented in the requirements the partnering
organisations put forward when the FESD project, together with the organisations,
wrote the functional requirements. It was a strong wish that the systems were as
transparent as possible in the registration process in daily use — so transparent that 
the users almost do not discover that they are actually registering documents and 
e-mails. 

The FESD project has prioritised that this transparency has to be closely followed 
by a high quality in the documentation process. This has been done because it is the 
project’s responsibility to observe the administrative legislation on the 
administration’s documentation work, and also to secure the quality of the archives 
being created. Quality is understood as the highest possible degree of documentation of 
the administrative process that has taken place. The following passage can be found
in the functional requirements: 


Fundamental in the understanding of an electronic case and document management system 
in administrative organisations is the close coherence between work processes and their 
documentation. It is a part of the rationale for an EDM system that it should be able to 
document the information constituting the basis for a ruling or decisions, which are decisions 
taken in the process of a consideration such as they appear in the cases and documents, which 
the system manages. 

One of the most vital functionalities in an EDM system is the registration system in use. It 
should at any time be possible to retrieve and identify any given case and its contents as it 
looked at any given time in the life cycle through the placing in the central registration 
system and the metadata connected to the case and the documents. 

But the case file and its contents (its documents) are not alone in documenting the process 
of consideration. In order for the documentation to be adequate there should exist context 
information about the work process or the process of consideration, which the case 
documents — how the case developed, who worked with the case, who took the decisions 
underway and who took the final decision. The three parts — the content, the context 
information and the registration system — together constitutes the needed documentation. 

The system should therefore be able to store context information (information about 
routing or workflows) together with the case, its documents and metadata[4]. 


The FESD project therefore equates documentation with the “sediments” of written or
other material, which a process of consideration or work process leaves behind, but it also
includes the actual passage through the given work process. It is important that this
process information is stored together with all other documentation since it should be
regarded as an integrated part of the collected organisational documentation. At a later
appraisal procedure the case, its documents, the placing in the registration system, the 
metadata and the actual process information are stored together as a unit. If we can secure 
a very high degree of automation in the registration process using the systems, maybe 
then it will be possible to stop the rapid fall in documentation quality, which has been the 
case for the past 30 years, and thereby regain a high quality in the documentation again. 
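
The idea of storing the case, its documents, its placing in the registration system, its
metadata and the process information together as a unit can be illustrated with a small
sketch. It is not the FESD data model: the class names, field names and sample values below
are assumptions chosen only to show how content, context information and registration
could be kept together, and how a case could be retrieved as it looked at a given time.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class ProcessEvent:
    """Context information: who did what to the case, and when."""
    timestamp: datetime
    actor: str
    action: str


@dataclass(frozen=True)
class Snapshot:
    """State of a case at one moment: registration placement, metadata, documents."""
    timestamp: datetime
    classification: str
    metadata: dict
    documents: tuple


@dataclass
class CaseRecord:
    """One unit of documentation: snapshots (content) plus events (context)."""
    case_id: str
    snapshots: list = field(default_factory=list)
    events: list = field(default_factory=list)

    def register(self, timestamp, actor, action, classification, metadata, documents):
        """Store a new snapshot together with the process event that caused it."""
        self.snapshots.append(
            Snapshot(timestamp, classification, dict(metadata), tuple(documents))
        )
        self.events.append(ProcessEvent(timestamp, actor, action))

    def as_of(self, moment):
        """Return the case as it looked at `moment`, or None if not yet registered."""
        earlier = [s for s in self.snapshots if s.timestamp <= moment]
        return max(earlier, key=lambda s: s.timestamp) if earlier else None


# Hypothetical case with two registration steps.
case = CaseRecord("2004-0001")
case.register(datetime(2004, 3, 1), "clerk", "opened", "planning/permits",
              {"title": "Permit"}, ["application.pdf"])
case.register(datetime(2004, 4, 1), "caseworker", "decided", "planning/permits",
              {"title": "Permit", "status": "granted"},
              ["application.pdf", "ruling.pdf"])
```

Because every registration step records both a snapshot and an event, the process
information never has to be reconstructed afterwards; a high degree of automation would
amount to the system calling something like `register` on the user's behalf.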


The standardisation efforts 

In parallel with the FESD project, important standardisation efforts are made with 
regard to the exchange of data between authorities, and between authorities, the 
private sector and citizens. The first two projects have been completed. On September 
1, 2003 all public organisations were obliged to be able to exchange information 
electronically with each other and, more importantly, they could refuse to accept
non-electronic information sent to them from other authorities. The next step was
taken on February 1, 2005, when the same rules were applied to the exchange of
information with companies and private citizens, although the refusal part was not
applied here. It is still possible, and will be for quite a long time, for a private citizen
or company to correspond with the authorities through personal encounters or
paper-based mail. The key issues in the two projects, besides the short-term gain in
savings on paper-based mail, are two-fold:


(1) automated exchange; and 


(2) the use of digital signatures and encryption methods when sensitive and 
personal information is exchanged. 


Another project concerns electronic invoices. As from February 1, 2005 it is no longer 
possible to send an invoice to the public sector unless it is electronic[5]. It is foreseen 
that this will cause problems, especially for smaller companies with local deliveries, 
but a firm line has been chosen to push forward the digital development process. 

With regard to the exchange of information, a matter for attention, besides the 
technical issues, is to avoid too much manual work in sending and receiving 
information and to ensure it is correctly placed within the registration system in the 
receiving organisation. The next step is therefore to automate the exchange as much as 
possible: allowing A to send information to B and for B to receive it and place it 
automatically within the registration system. These processes demand standards for 
registration practices, classification, metadata and much more. 
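
The automated exchange between A and B can be sketched as follows. The sketch assumes a
message that carries a minimum set of metadata alongside the document; the field names in
`REQUIRED_METADATA`, the `send` and `receive` helpers and the in-memory registry are
hypothetical and are not taken from the DokForm standard.

```python
# Hypothetical minimum metadata; the real set is defined by the standard, not here.
REQUIRED_METADATA = ("sender", "recipient", "title", "classification", "date")


class MissingMetadata(ValueError):
    """Raised when a message cannot be filed automatically."""


def send(sender, recipient, title, classification, date, payload):
    """Authority A wraps a document and its metadata in one message."""
    return {
        "metadata": {
            "sender": sender,
            "recipient": recipient,
            "title": title,
            "classification": classification,
            "date": date,
        },
        "payload": payload,
    }


def receive(message, registry):
    """Authority B files the message automatically, or rejects it for manual handling."""
    metadata = message.get("metadata", {})
    missing = [key for key in REQUIRED_METADATA if not metadata.get(key)]
    if missing:
        raise MissingMetadata(", ".join(missing))
    # The classification decides where the message lands in B's registration system.
    registry.setdefault(metadata["classification"], []).append(message)
    return metadata["classification"]


# Hypothetical exchange between two authorities.
registry_b = {}
message = send("Authority A", "Authority B", "Road repair request",
               "roads/maintenance", "2005-02-01", b"%PDF...")
placed_under = receive(message, registry_b)
```

The sketch makes the dependency on shared standards visible: B can only place the message
without manual work if A and B agree on which metadata are required and on what the
classification values mean.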

The work with standards has already been in progress for several years in the 
so-called DokForm group within the Ministry of Science, Technology and Innovation. 
The group has identified a minimum of metadata that should be sent with the 
information when it is exchanged. The ongoing standardisation work has prioritised 
transforming the DokForm standard to a set of technical standards, which underwent a 
hearing process in the summer of 2004 and were approved in October 2004. 

The purpose of the FESD standardisation group is to promote interoperability, 
which is understood as standardisation of, for example, protocols for the exchange of 
information and defined modules in EDM systems. This implies that there is a need for 
creating mutual core functionality for public EDM systems, which also provides for
further uniform development of the systems’ core. 

The core should provide possibilities in the following areas: 


* work processes across organisations; 


* moving and changing of fields of responsibilities within levels and between 
levels within the public sector[6]; and 


* the movement of selected assignments between authorities. 


The necessary common public technical standards in these areas are to be provided 
over the coming years. The more concrete aims concern technical standardisation in 
the following areas: 


* a mutual data model for EDM systems; 
* uniform add-on modules for EDM systems; and 
* uniform interfaces to other systems.

In the tender, the FESD project requires a strict observation of the chosen standards for
data modelling and registration. The standardisation work follows the Norwegian
standard NOARK 4.1, which is considered a de facto standard, and also the European
standard MoReq. Others will be identified during the project’s lifetime. The
standardisation work is developing a number of standards in different areas every six
months, giving the vendors on the framework contract an additional nine months to
implement them into their systems.


Status — with less than a month left 

The FESD project is at this moment left with no more than a month to completion. It is
time to reflect on the lessons learned and, just as important, on what happens next.
The project has delivered most of the required products. A set of functional 
requirements covering about 80-90 per cent of general needs has been established 
through a unique cooperation incorporating municipalities, county councils and central 
administration. Mutual ways of working were found and prejudices were put to shame. 
The tender was completed and three consortia signed the 4-6 year framework contract. 
The FESD project has contributed to the whole decision making process of finding the 
right consortium and the right system and afterwards to the implementation process 
by publishing a number of practical guides on our web site on how to deal with these 
situations. The implementation projects in the partnering organisations have not come 
far enough to be able to deliver any reports on efficiency gains. These cannot be 
measured until at least a year or two has passed and the system has found its natural 
place within the organisations. 

The latter could very well be one of the biggest lessons learned from the project. It is 
vital to have visions and to be able to formulate the big goals, but it is equally 
important to acknowledge the fact that it is necessary to take small steps in the 
direction one wants to walk. Efficiency gains are not visible until at least one year after 
implementation. In fact, one should expect a loss of momentum in the work done while 
employees get accustomed to the system and the system gets adjusted to the work 
processes. 

Being more efficient by using technology is a big issue among the planning parties
working with digital administration, but it may be that IT within the administration
does not make us more efficient, but simply provides us with easier and standardised
ways of communicating, enabling shared knowledge and securing proper
documentation. There are certainly doubts about whether it makes us more efficient 
in the short term. However, organisational change is taking place rapidly, and this may 
turn out to be one of the biggest gains from the project and the standardisation efforts 
— the ability to quickly restructure and change organisations as necessary with the use 
of IT. 

Looking ahead, the FESD project is continuing in other forms. The framework
contract runs for another 3-5 years and will be maintained and developed further over
the years by a centrally placed project organisation working closely together with the 
FESD standardisation group. The consultancy and marketing efforts will be 
maintained by the sectors themselves through the organisations, Local Governments 
Denmark, the County Councils of Denmark and the Agency for Government 
Management. The endeavour to provide high quality EDM systems based on mutual 
standards at competitive prices, implementation consulting and further development
will continue for at least the next three years. It will hopefully bring better systems and
better implementation projects to public organisations and maintain Denmark’s fine
position as a country with one of the highest levels of penetration of IT use in the public 
sector, both internally and towards citizens and companies. 


Notes

1. FESD is a Danish acronym for Joint Electronic Case and Document Management
(Fællesoffentlig Elektronisk Sags- og Dokumenthåndtering).

2. For further information, see www.e.gov.dk

3. The organisations are the following: local governments (the municipality of Ribe, the
municipality of Ringsted, the municipality of Skibby, the municipality of Bramsnæs and the
regional municipality of Bornholm, where the two authorities have been merged into one);
regional councils (Ribe County Council and Sønderjylland County Council, together covering
most of the south of Jutland); central organisations (the Ministry of Science, Technology and
Innovation, the Ministry of the Environment and the Ministry of Social Affairs).

4. The text is inspired by the author’s PhD thesis on the subject and furthermore based on
many of the thoughts of David Bearman, Pittsburgh University, and Ulf Andersson, Astra
Zeneca. The PhD thesis is unpublished and in Danish. It was written between 1995 and 1999
at the Institute for History at Copenhagen University and at the State Archives in
Copenhagen. Some of the work was done in a three-month period at SLAIS, UCL in 1997. The
functional requirements were published as part of the collected tender material in April 2003
on www.e.gov.dk/fesd

5. These and many similar projects are more thoroughly described at www.e.gov.dk

6. The public sector is planning the largest organisational change seen in 35 years. The
municipalities, of which there are 270 today, will from January 1, 2007 be merged into only 100
municipalities; the county councils as we know them today will disappear and become
regions with health care as the sole area of responsibility, moving, for example, the
administration of environmental issues to the municipalities.
Received October 2004
Accepted January 2005

Abstract

Purpose — Television has long been cited by viewers as their primary and most trusted source of
news, especially in relation to news of national and international affairs. Aims to explore the issue of 
trust in the television news. 
Design/methodology/approach — The paper combines narrative and analysis. Questions whether 
public trust in the BBC was damaged by the Hutton inquiry: would the BBC's reputation as the 
nation’s premier news service be tarnished in the longer-term and had public trust in journalism been 
severely compromised. 
Findings — Events that followed the transmission of a report about the veracity of the government’s 
case for going to war carried by a BBC radio news broadcast on 29 May 2003 called into question the 
Corporation’s competence as a reliable news provider. The story alleged that an informed source had 
told BBC correspondent Andrew Gilligan that the government had exaggerated the immediacy of 
dangers posed to the west by Saddam Hussein’s weapons of mass destruction. The source who was 
eventually exposed was a Ministry of Defence expert on Iraq, Dr David Kelly, who later killed himself. 
The Prime Minister ordered a public inquiry into Dr Kelly’s death, led by Lord Hutton, who severely 
criticised the competence of the BBC’s senior management and the quality of its journalism practices. 
These conclusions prompted the resignation of the Corporation’s Chairman and Director General. 
Hutton’s findings had wider implications for the future governance of the BBC and invoked 
far-reaching questions about the trust that the public could place in journalism. The evidence indicates 
that while the public felt that the BBC had been culpable for failing to launch its own internal inquiry 
into the Gilligan report, the public perceived this incident as a one-off aberration rather than as being 
symptomatic of some wider malaise. Indeed, the Hutton inquiry had impacted more upon public trust 
in the government and led people to question the independence of the Hutton inquiry.
Practical implications -- While trust in journalists is far from universal, the public differentiate 
among journalists in terms of the news organisations they work for. Among these, the BBC remains 
one of the most widely trusted. 
Originality/value — An exploration of the issue of trust in the television news following the Dr 
David Kelly/Andrew Gilligan report on “The Today Programme” and subsequent Hutton enquiry. 


Keywords Information media, Television news, Trust 
Paper type General review 


Introduction 
The news has long been regarded as one of the most important services provided by 
television. This observation is underpinned by public opinion concerning television 
(Gunter et al, 1994; Towler, 2003) and by broadcasting legislation (Gunter, 1997). Yet,
in recent debates, questions have been raised about the significance of televised news

Following the Hutton inquiry into events leading up to the death of government
scientist Dr David Kelly in 2003, however, serious questions were raised about the 
quality of news journalism in the BBC that could potentially have led to wider concerns 
about mainstream news broadcasting. Is there any evidence therefore that television 
news is losing public trust and that the nation’s flagship news broadcaster — the BBC 
— is no longer the premier source of news in the country? 

One factor underpinning the reputation of televised news in the UK is that it 
represents a genre that is seen as synonymous with the core principles of public service 
broadcasting (PSB). In debates about whether PSB should be preserved, changed or 
replaced, the provision of news has occupied a central position (Hargreaves and 
Thomas, 2002). The latest communications legislation in Britain has stipulated that 
one of the core purposes of PSB is to ensure that such services facilitate “... civic 
understanding and fair and well-informed debate on news and current affairs ...” 
through the provision of “... a comprehensive and authoritative coverage of news and 
current affairs in, and in different parts of, the United Kingdom and from around the 
world” (Communications Act, 2003, Part 3, Chapter 4, 264(6)[c]). 


Television as main source of news 

The acquisition of news information has long been endorsed by viewers as a primary 
motive for watching television (Rubin, 1983) and television has repeatedly been 
identified as people’s primary information source (Gunter, 1987; Levy and Robinson, 
1986; Towler, 2003). 

The endorsement of television as the main news source, however, has fluctuated 
from one year to the next and varies with the type of news. Among UK television 
viewers, for instance, television was named as the main source of world news by 66 per 
cent in 2001, by 70 per cent in 1991, by 52 per cent in 1981 and by 55 per cent in 1971. In 
those particular years, newspapers were endorsed as the main source, respectively, by 
15, 19, 34 and 40 per cent. Hence, television has grown in stature as the main news 
source, as perceived by viewers, and newspapers have declined (Towler, 2003). 

The positions of newspapers and television have generally been reversed in respect 
of being the main source of local news. Newspapers were named as the main local news 
source by 57 per cent in 1981, by 48 per cent in 1991, and by 47 per cent in 2001. 
Television was endorsed as such in each of these years, respectively, by 12 per cent, 25 
per cent and 28 per cent. In 1998, however, television temporarily overtook newspapers 
as the named main local news source for the first time (40 per cent versus 38 per cent). 
During the subsequent three years, newspapers grabbed back pole position again, but 
television surged ahead once more in 2002 (48 per cent versus 32 per cent) by a more 
significant margin than in 1998 (Towler, 2003). 

The importance of television news has been endorsed by audience research that has 
been conducted as part of Ofcom’s review of PSB and the BBC's review of its own 
governance linked to renewal of its Charter. A large survey of UK viewers undertaken by 
Ofcom (2004) found that 87 per cent of respondents felt that news and other information 
programmes that keep the population well-informed were important for terrestrial 
television. An overwhelming majority (85 per cent) also expressed satisfaction with 
news on terrestrial television channels (i.e. BBC1, BBC2, ITV1, Channel 4 and 5). Similar
research by the BBC (2004) found that a significant majority of UK viewers (79 per cent) 
rated news and television as important to the country as a whole.


Television as a trusted news source

There is consistent evidence, internationally, that people place a great deal of trust in 
television as an informational medium. For many years, television has been rated as 
the most objective news medium (Youman, 1972). It is also the medium people say they 
rely on first and foremost for their news (Stanley and Niemi, 1990). When faced with 
conflicting reports in different news media, far more people usually say they would 
believe the television version over newspapers (Lee, 1975; Rubin, 1983). When asked to 
judge different media independently for their reliability as news sources, most people 
regard television, radio and newspapers as reliable sources and the gap between the 
media narrows (Carter and Greenberg, 1965). 

In Britain, significant majorities of viewers have said that factual programmes on 
television are accurate all or most of the time. This opinion is widely held about news 
(89 per cent), current affairs (69 per cent), and dramatised reconstructions (68 per cent) 
and less widely held about documentaries (59 per cent) (Towler, 2003). The relative 
degree of trust invested in different news media, however, can depend upon the type of 
news and the nature of the evaluation. For example, when UK television viewers were 
asked which news sources are best for news of national and international significance, 
television was endorsed by between seven and eight out of ten respondents in terms of 
being “most complete”, “most accurate”, “most fair and unbiased”, “quickest”, and as 
offering the “clearest understanding”. Newspapers and radio trailed far behind. When 
asked to provide the same evaluations in the case of news of local and regional 
significance, television was rated highest by around one in two respondents, and the 
gap between television and newspapers was much narrowed (Gunter et al, 1994). 

It has also emerged that viewers make subtle distinctions about television on the 
basis of the “honesty” of factual programmes. Despite many viewers reporting that 
different types of factual programmes display accuracy, certain exemplars of these 
genres attracted less public confidence. In one UK-wide survey of viewers, respondents 
were asked how much they believed that three different types of factual programme 
presented honest portrayals of individuals and the situations they are in. The 
programme options included: fly-on-the-wall documentaries or docu-soaps such as 
Hotel or Airline; daytime talk shows such as Trisha or Jerry Springer; and shows about
real people placed in new situations such as Castaway, Big Brother or Survivor. The
number of people saying they thought such programmes were always honest was
very small, ranging from 2 to 7 per cent of the sample. Around four in ten thought that
docu-soaps were honest all or most of the time (42 per cent). Smaller proportions 
believed the same was true of real people in new situation shows (20 per cent) or 
daytime talk shows (9 per cent). Multi-channel TV viewers (47 per cent) were more 
likely than terrestrial-only viewers (37 per cent) to say that docu-soaps were honest at 
least most of the time (Towler, 2003). 


Trust in television institutions 

While television is trusted as a news medium by the majority of mass publics in 
developed democratic countries, not all television is trusted to the same degree. 
Differences have emerged between television channels in how impartial they are rated. 
In making these kinds of judgements, however, members of the public often confound 
their perceptions of broadcast institutions with their opinions about the content 
transmitted by those institutions. Morrison (1986) reported that only modest minorities
of British viewers felt that the BBC would be critical of the government. The reasons
for this were perceived primarily to lie with government sympathies believed to be held 
by the BBC and the fact that the BBC was dependent on government for its funding. 
When their attention was focused on programmes, though, few viewers were found to 
criticise the BBC on the grounds that its programmes were politically biased (Menneer, 
1989). When asked if they thought the news on BBC or ITV was biased, fewer than one 
in five national survey respondents accused BBC news of bias and only one in ten 
levelled the same accusation at the news on ITV (Menneer, 1989). Even among viewers 
who perceived that there was sometimes bias on these channels, a majority did not 
believe that this was generally true of news programmes on BBC and ITV (Bunker, 
1988). 

Not all public opinion research has revealed complete trust in television or in some 
of its primary news providers among viewers. Different ways of probing this issue 
have yielded varying profiles of response that provide conflicting evidence of how 
much people do trust television. For example, when in one survey, British viewers were 
asked if they thought television news was likely or unlikely to reflect a government 
position on an issue, most felt that this was likely rather than unlikely (Collins, 1984). 
This outcome was perceived to be likely regardless of which political party was in 
government. It was also perceived to be more strongly a characteristic of the BBC, 
which was seen as a more establishment-oriented organisation. 

Opinion data collected over time have confirmed the earlier observation of a 
perceived Conservative bias in the BBC while the Conservatives were in government, but a more
balanced view of ITV among UK viewers. In more recent times, however, a clear 
Labour bias perception materialised in relation to both these channels. It is important 
to note that these political preferences of broadcasting institutions were derived from 
minorities of viewers who perceived any bias at all. The great majority of UK viewers 
have tended to perceive no political bias among television institutions. For example, in 
1991, political party preferences were perceived by 29 per cent of UK viewers in the 
case of BBC1 and by 13 per cent in the case of ITV. In 2001, these percentages had 
dropped by significant margins to 14 per cent (BBC1) and 4 per cent (ITV). Among
those who perceived any bias on BBC1 in 1991, the split in terms of party preferred was 
63 per cent Conservative versus 36 per cent Labour. In 2001, the split was 33 per cent 
Conservative and 57 per cent Labour. For ITV, the 1991 split was 50:50 per cent, but by 
2001, it was 63 per cent Labour versus 31 per cent Conservative (Towler, 2002, 2003). 
There are three findings that stand out here. First, for the great majority of viewers, the 
major broadcast organisations are perceived to exhibit no particular political party 
preferences. Second, the extent to which any political bias is perceived has been 
diminishing over time. Third, among the minorities of viewers who do perceive any 
political bias on the part of major broadcast organisations, the direction of that bias 
tends to favour the incumbent government rather than the opposition party. 


A challenge to the trust in television 

Against this long history of public trust in television as a news medium, events in 2003 
and 2004 called into question the credibility, objectivity and professionalism of the 
UK’s major public broadcaster, the BBC. These events surrounded a story that
allegedly exposed the government for making misleading claims about the military 
capacity of the regime in Iraq to justify going to war.

At 6.07 a.m. on May 29, 2003, BBC correspondent Andrew Gilligan filed a report on
the Today Programme on Radio 4 in which he claimed that a Defence Ministry source 
had accused the government of exaggerating claims about the danger posed to Britain
and the rest of the world by Iraq in order to justify military invasion of that country 
and the toppling of its ruling dictator, Saddam Hussein. In a famous phrase, the source 
was reported to have claimed that the government “sexed up” its September 2002 
dossier on Iraq. In so doing, the government claimed that Iraq could launch weapons of 
mass destruction (by which were meant biological, chemical and nuclear weapons) 
within 45 minutes. The Prime Minister’s then Director of Communications, Alastair 
Campbell, was identified as playing a prominent role in the drafting of this statement, 
allegedly against the advice of the intelligence services who took a more cautious view 
of the immediacy and extent of the danger from Iraq’s weapons of mass destruction.

An internal investigation was launched by the government to seek out the source. 
He was eventually identified as Dr David Kelly, a Ministry of Defence (MoD) expert on 
Iraq. Shortly after a Commons Select Committee enquiry at which he faced a public 
grilling over his part in the Gilligan story, Dr Kelly was discovered dead near his home, 
having apparently committed suicide. The Prime Minister immediately launched a 
public inquiry into the circumstances surrounding Dr Kelly’s death, led by Lord 
Hutton. The Hutton inquiry heard evidence from government and intelligence sources, 
from the BBC and from Dr Kelly’s family and other relevant sources. 

The Hutton inquiry found Gilligan’s allegation that the government “sexed up” its 
Iraq dossier to be false. Tony Blair and Alastair Campbell were cleared of any 
wrongdoing and Defence Minister Geoff Hoon got off lightly. The MoD was criticised 
for not telling Dr Kelly that his name could be confirmed or could come out as the 
source of Gilligan’s accusation. The inquiry found no evidence, however, of any 
underhand or dishonourable conduct on the part of the MoD. 

The BBC, in contrast, was found to be seriously culpable in several respects. 
Andrew Gilligan was criticised for poor journalistic practice and the BBC's 
management was castigated for failing to instigate the proper checks on the 
veracity of this report, given the seriousness of its allegations. Hutton’s criticisms of 
the BBC were so severe that the BBC’s Chairman, Gavyn Davies and Director General, 
Greg Dyke, resigned within two days of the findings being published on January 28, 
2004. 

Gilligan had claimed that the government had put pressure on the intelligence 
services to insert claims lacking in unequivocal evidence, that Saddam could launch 
weapons of mass destruction within 45 minutes. He subsequently wrote a newspaper 
article blaming Alastair Campbell personally for persuading intelligence chiefs to 
revise the dossier. Hutton called Gilligan’s accusations unfounded. Doubt was cast on 
Gilligan’s recollection of his conversation with Dr Kelly after he lost his notes and
typed accounts from memory into a computer. Indeed, Gilligan offered two versions of
his notes during the inquiry that were not consistent in every detail. The BBC was 
accused of failing in its duty to the licence payers. Its editorial systems were found to 
be defective. It failed to investigate Gilligan’s claims properly. BBC senior management 
appeared not to treat seriously the demands from Downing Street that evidence for 
Gilligan’s allegations should be produced. Indeed, the BBC’s management failed to 
appreciate that Gilligan’s notes did not fully support the allegations. The BBC’s Board 
of Governors was also criticised for sloppy decision-making. According to Hutton, they
failed to take appropriate steps by launching a more detailed investigation into
whether Gilligan’s allegation was supported by his notes. Although not explicitly 
demanded, the broad tenor of Hutton’s remarks implied that changes might be 
considered in respect of the governance of the BBC. 

In a media backlash, the news media began to question in broader terms the 
message the government had put out about the seriousness of the dangers posed by 
Iraq. The Hutton inquiry was dismissed, by some critics, as a whitewash because it 
had so completely let the government off the hook. Gavyn Davies launched a 
counter-attack in which he questioned whether Hutton had taken enough notice of Dr 
Kelly’s taped conversation with another BBC journalist, Susan Watts, in which he cast 
doubt on the 45 minute claim. Greg Dyke, while also responding that not all aspects of 
Gilligan’s story were inaccurate, nevertheless conceded that the BBC's editorial 
systems had been defective on this occasion. Dyke then fired a parting salvo in which 
he accused the Hutton inquiry of being one-sided and the Prime Minister’s office of 
“systematic bullying” and “intimidation” of the BBC over its coverage of the Iraq war. 
He also argued that the Gilligan report, while having inadequate evidence to reinforce 
the “sexing up” claim, was nonetheless accurate in other respects. This intervention 
contrasted with the unreserved apology to the government issued by the BBC’s acting 
chairman, Lord Ryder. 

The alleged concerns that the Blair Government had about the quality of the BBC’s 
war coverage were not shared by the general public. While news consumption 
generally increased during the war on Iraq, and especially in respect of television, the 
BBC remained the source to which most people said they turned first. Preference for the 
BBC's news was strongest in households able to receive terrestrial channels only. 
Elsewhere, it emerged that the BBC’s news continued to be trusted by clear majorities 
of its own viewers to offer accurate (86 per cent), informative (89 per cent), balanced (74 
per cent) and interesting (71 per cent) war news coverage. It is interesting to note, 
however, that the BBC’s news did not command the highest regard of all television 
news broadcasters on these indicators. Indeed, opinions given by viewers of different 
television news broadcasters indicated that the best overall performer was Sky News 
which was most often regarded as accurate (89 per cent), informative (92 per cent) and 
balanced (81 per cent) and second most likely to be seen as interesting (84 per cent) 
after ITV news (86 per cent) (Sancho and Glover, 2003). Despite the plaudits received 
for its war coverage, did the events that followed the war cause serious damage to the 
BBC's reputation as a serious and trusted news provider among the public? 


Who did the public trust? 

Public opinion was surveyed repeatedly during the Hutton inquiry. Then a flurry of 
polls appeared at the time of publication of Hutton’s findings. Was there any evidence 
that the public’s trust in the BBC had been shaken by the events surrounding the Kelly 
affair? Might this have longer term effects on the public’s trust in television as a news
source?

A series of polls conducted by YouGov tracked public opinion about who was to 
blame for Dr Kelly’s death, while the Hutton inquiry was taking place. All these polls 
were carried out online, with weighted samples of over 2,000 respondents. In one poll 
(1-8 August, 2003), when asked who was to blame most for Dr Kelly’s death, more than
four in ten respondents (41 per cent) said the government was to blame because it made 
Dr Kelly’s name public and suggested that he was the BBC’s main source. This
compared with one in three (32 per cent) who chose the Commons Select Committee for 
subjecting Dr Kelly to a public interrogation. Fewer than one in ten (9 per cent) felt that 
the BBC was to blame (YouGov, 2003a). 

One month later (4-5 September, 2003), a further national YouGov poll found that
more blame was apportioned to the government than the BBC over Dr Kelly’s death. 
Clear majorities of respondents felt a great deal or a fair amount of blame should be 
assigned to Defence Minister Geoff Hoon (69 per cent), Tony Blair (62 per cent) and the 
Commons Select Committee (57 per cent). On this occasion, a significantly increased 
minority, compared with one month earlier, blamed the BBC as well (45 per cent) 
(YouGov, 2003b). In the same survey, respondents were also asked whether different 
parties had come out of the inquiry well or badly. Most respondents felt that Geoff 
Hoon (61 per cent), Alastair Campbell (59 per cent), and Tony Blair (56 per cent) had all 
come out of the inquiry badly. Over half the respondents (54 per cent) felt that the 
government as a whole had also come out badly. Far fewer held this opinion about the 
intelligence services (34 per cent). Indeed, the BBC (38 per cent) and more especially 
Andrew Gilligan (47 per cent) were more likely than the intelligence services to be 
perceived in a poor light (YouGov, 2003b).

A few weeks before the Hutton findings were published, YouGov reported a further 
shift in public opinion towards the BBC (8-10 January, 2004). Two out of three 
respondents (66 per cent) in a national survey still blamed (a lot or a fair amount) Tony 
Blair for Dr Kelly’s death, an increase on earlier polls. However, there was also a 
marked increase in the prevalence (55 per cent) of blame attached to the BBC as well. 
Even more poignant, in view of the events that followed a few weeks later, was the 
finding that just over half of respondents (51 per cent) felt that the BBC’s Director 
General, Greg Dyke, should resign if the BBC was strongly criticized by the Hutton 
inquiry (YouGov, 2004a). 

Prior to Hutton’s verdict being published, therefore, most people in the UK blamed
the government for what happened to Dr Kelly. But over time, public opinion about the 
BBC gradually shifted and became more unfavourable. In one of the last polls (22-24 
January, 2004) conducted before the Hutton report was published, however, YouGov 
(2004b) invited respondents to make a direct comparison between various potential 
sources of blame for Dr Kelly’s death and to choose the source they held to be most to 
blame. In this context, more than three times as many respondents chose the 
government (35 per cent) as the BBC (11 per cent). Even more significant still was the 
finding that the greatest single proportion of respondents (39 per cent) felt that Dr 
Kelly himself was to blame. 

Upon publication of Hutton’s report and over the following days, a number of polls 
probed public opinion about the outcome of the inquiry and about events that followed 
it. Public opinion polls confirmed media accusations of a Hutton whitewash. While the 
BBC had been culpable in some degree for its own demise, the public did not believe 
that the government was totally innocent of any wrongdoing in relation to claims that 
the dangers posed by Saddam Hussein were imminent, and that a direct threat to 
Britain was real. 

This profile of public opinion was illustrated by the results of polls run by Populus 
for The Times on 28 and 29 January, 2004 (Riddell, 2004), by YouGov on 29 January, 
2004 (YouGov, 2004c), 30 and 31 January, 2004 (YouGov, 2004d, e), and, allowing a few
days for the public to digest the arguments and accusations that raged on in the
aftermath of Hutton’s findings, an online survey between February 2 and 10, 2004, by 
the British Life and Internet Project (2004)[1]. 

The findings from these polls can be considered together under a number of 
headings. Public opinion was probed about the Hutton inquiry and its outcome; the 
behaviour of different individuals and institutions that had been involved in the 
inquiry and the events it had investigated; the degree of public trust in individuals and 
institutions, and especially the perceptions of the BBC; implications for the future of the 
BBC; and finally trust in journalists and journalism. 


The Hutton inquiry 

There was widespread disagreement with the findings of the Hutton inquiry. YouGov 
(2004d) found that more than half the people surveyed just after publication of Hutton’s 
report (54 per cent) disagreed with its conclusion that the government did not “sex up” 
its September 2002 dossier on Iraq, with half as many (27 per cent) saying they agreed 
with this conclusion. When asked about the robustness of Hutton’s report, most 
respondents felt the enquiry had been a “whitewash” (55 per cent), with around one in 
four (26 per cent) saying Hutton’s conclusions were judicious and balanced. 

The British Life and Internet Project (2004) began by asking how much respondents 
believed the country can trust the result of the Hutton inquiry into the death of Dr 
David Kelly. Distrust in the inquiry outweighed the extent to which people trusted it. 
More than two thirds of respondents expressed distrust in the Hutton inquiry 
compared with three in ten (30 per cent) who felt it could be trusted. When asked 
whether they felt that the Hutton inquiry had reached fair and justified conclusions 
about the BBC, most (63 per cent) felt that it had not, with just over one in five (21 per 
cent) believing it had. 


Perceptions of different individuals and institutions 

Regardless of people’s opinions about the Hutton inquiry itself, there were important
questions about the way different individuals and institutions had acted during the 
course of the inquiry. YouGov (2004c) asked respondents whether, in light of the 
Hutton report, different individuals or institutions had acted properly or improperly. 
None of the key parties involved, whether from government, the civil service or the
BBC, was regarded in a favourable light by the public. Among members of the
government and their closest advisors, all were more widely believed to have acted 
improperly than properly during the inquiry. This was true of Tony Blair (52 per cent 
improperly versus 37 per cent properly), Alastair Campbell (61 per cent versus 25 per 
cent), and Geoff Hoon (60 per cent versus 24 per cent). It was also perceived to be true of 
Dr Kelly’s managers in the MoD (66 per cent improperly versus 10 per cent properly) 
and to a lesser degree of Sir John Scarlett, the chairman of the Joint Intelligence 
Committee (37 per cent versus 22 per cent). 

The same survey also found that the BBC was regarded poorly in this context. 
Many more people thought that improper than proper conduct had characterized the 
behaviour of the Corporation and its personnel. This finding was true for Greg Dyke, 
the Director General (61 per cent improper versus 25 per cent proper), the Board of 
Governors (62 per cent versus 22 per cent), and most especially Andrew Gilligan, the 
journalist who broadcast the story about the “sexing up” of the Iraq dossier (72 per cent
versus 17 per cent). 

The British Life and Internet Project (2004) also asked respondents to indicate their 
reactions to the behaviour of Tony Blair and Geoff Hoon, the Defence Secretary, 
following the death of Dr Kelly. They were asked, in each case, whether they thought 
these individuals had behaved honourably. A narrow majority (54 per cent) felt that Mr 
Blair had not behaved honourably (with 25 per cent saying he had done so) and a larger 
majority (61 per cent) believed the same of Mr Hoon (compared with 13 per cent saying 
he had done so). 

Respondents exhibited mixed opinions about the resignation of BBC Chairman 
Gavyn Davies following publication of the Hutton report. Between four and five in ten 
(45 per cent) felt Davies did the right thing, while nearly four in ten (38 per cent) did not. 
There was widespread support for BBC Director General Greg Dyke. Two in three 
respondents (67 per cent) thought that it was not justified that he should have lost his 
job following the findings of Hutton, with nearly one in five (19 per cent) saying it was 
justified. 

On the question of whether other senior heads should roll at the BBC, most 
respondents (64 per cent) believed that no other top management should lose their jobs 
over this matter, with less than one in five (19 per cent) who thought that there should 
be more job losses. In regard to Andrew Gilligan, who resigned the day after Dyke’s 
departure, survey respondents held mixed opinions. Similar percentages felt he should 
lose his job (40 per cent) or not do so (39 per cent). One in five (21 per cent) were 
undecided. 


Trust in individuals and institutions 

A principal focus of the post-Hutton polls was how the inquiry and its aftermath had
affected public trust in various institutions. In the context of the current discussion, 
public perceptions of the BBC are especially relevant. Had the BBC’s reputation as a 
primary news source, and one to which the public could turn at critical times and trust, 
been seriously undermined by Hutton? In examining the answer to this question, it is 
also important to consider how public opinion towards the government and the Prime 
Minister had been affected. 

There is little doubt that the BBC bore the brunt of Hutton’s criticism. Had the
outcome of Hutton radically altered the public’s image of the BBC and the Prime 
Minister? Riddell (2004) reported that among respondents to The Times/Populus poll, 
narrow majorities said that their feelings about Tony Blair (53 per cent) and the BBC 
(55 per cent) remained unchanged. However, among respondents whose feelings had 
changed, attitudes had more often become less favourable than more favourable both 
for Tony Blair (36 per cent versus 11 per cent) and the BBC (34 per cent versus 7 per
cent). Most respondents to this survey (65 per cent) also believed that the Hutton 
inquiry would make a difference to the way the BBC reports news stories in the future. 
Far fewer (38 per cent) felt it would make any difference to the future behaviour of the
government. 

In the aftermath of Hutton, however, the public were far more likely to trust the BBC 
to tell the truth (44 per cent) than the government (7 per cent). Significantly, though, 
nearly three in ten (29 per cent) said they would trust neither (YouGov,
2004d). 








Implications for the BBC 

Although Hutton did not make any explicit recommendations for the future 
governance or management of the BBC, there were implications in his conclusions that 
the Corporation’s management procedures as much as the quality of its journalism had 
been flawed on this occasion. This was recognized in statements made by Greg Dyke 
following his resignation and by the new acting chairman, Lord Ryder (Chapman and 
Conlan, 2004; Hughes et al., 2004). Public opinion reflected a wider perception that a
review of the BBC’s internal editorial and management practices was overdue and that 
there could be deeper implications for the form of future BBC governance. This 
observation should be qualified by saying that the public took a cautious view on these 
issues and there was no indication from any post-Hutton polls of public support for a 
radical overhaul of the BBC. 

The British Life and Internet Project (2004) asked its respondents about the 
implications for the future of the BBC. In the light of the controversy over BBC 
journalist Andrew Gilligan’s report that the government “sexed up” the evidence about 
the speed with which Iraq could deploy weapons of mass destruction, there was mixed 
opinion on whether a radical overhaul of its news editorial procedures and practices 
was necessary. Just over four in ten (46 per cent) believed this was necessary and 
somewhat fewer respondents (35 per cent) believed it was not necessary. 

It has been argued that the outcome of Hutton could have significant long-term 
implications for the BBC in respect of the trust in which it is held by the public and also 
in relation to the renewal of its Charter. Most respondents in this survey (74 per cent), 
however, believed that the BBC had not been irreparably damaged by the Hutton 
inquiry. Answers to a follow-up question went some way towards explaining why 
respondents thought this way. 

When asked whether they thought the events surrounding the Andrew Gilligan 
story about the government “sexing up” the evidence about Iraq’s alleged weapons of 
mass destruction could be seen as an unfortunate one-off error on the part of the BBC’s 
news management or was indicative of wider management incompetence, a majority 
(59 per cent) regarded the episode as a one-off blip, while a minority (24 per cent) 
thought it signalled wider problems. 

When asked which television station’s news was trusted the most these days, the 
BBC still came out on top. One in three (33 per cent) nominated BBC1, one in six (16 per
cent) named BBC News 24, and a further eight per cent named BBC2. Thus, well over 
one in two respondents nominated a BBC TV news source. After the BBC, Channel 4 
emerged as the next most trusted television news source (18 per cent). Other UK TV 
news services were nominated much less often: Sky News (8 per cent), ITV1 (6 per 
cent), Five (1 per cent). Interestingly, CNN (7 per cent) received more nominations than 
some UK TV news sources (British Life and Internet Project, 2004). 

In consideration of whether there was a need now to consider a radical change to the 
way the BBC is regulated, a majority (69 per cent) replied “no”. Again, a minority (20 
per cent) believed there was a need for a regulatory overhaul. On probing further for 
levels of support concerning different future regulatory options, among those who 
supported regulatory change, 39 per cent felt the BBC should be regulated by Ofcom 
(the new broadcast and telecoms regulator), 25 per cent thought that the role of the BBC 
Board of Governors should be changed such that they regulate but do not get involved 
in internal management decisions of any kind, and 25 per cent felt it should be 
regulated by a separate regulatory body, other than Ofcom. One in eight (12 per cent)
had no firm opinion on this subject (British Life and Internet Project, 2004). These 
findings on future BBC governance were supported by YouGov (2004c) which found 
more widespread public support for maintaining the status quo with the BBC Board of 
Governors (55 per cent) than for transference of BBC regulation to Ofcom (38 per cent). 

Indeed, despite the shortcomings of Andrew Gilligan’s report, there was a wider 
view that the potential importance of the story was so great that it was right to raise 
the issue with the government by broadcasting the story about the dodgy dossier.
When the British Life and Internet Project (2004) asked its respondents whether the 
BBC did the right thing or wrong thing in broadcasting the story, significantly more 
respondents felt the BBC had done the right thing (71 per cent) than the wrong thing 
(18 per cent) in this case. 


Implications for journalism 
The apologetic tone of the BBC’s new senior management post-Hutton led even some of 
the Corporation’s own journalists to argue for a more robust stand on behalf of the 
integrity and independence of the BBC’s journalism (Brooks and Dowell, 2004). A 
majority of respondents (70 per cent) to the YouGov (2004c) poll indicated that they 
shared concerns that the BBC might become too cautious in its news and current 
affairs coverage and too subject to behind-the-scenes pressure from government. 
The same survey probed levels of public trust in journalists and politicians and 
found that broadcast journalists from the BBC and ITV and broadsheet newspaper 
journalists were far more widely trusted than politicians. Clear majorities of 
respondents said they had a great deal or fair amount of trust in journalists from ITV 
(72 per cent), the BBC (67 per cent) and broadsheet newspapers such as The Times, The 
Guardian and The Daily Telegraph (60 per cent). Fewer endorsed this kind of trust in 
journalists from mid-market newspapers such as the Daily Express or Daily Mail (34 
per cent) or the “red-top” tabloids such as the Daily Mirror or The Sun (12 per cent). 
Trust in the Labour government (31 per cent) and in leading Conservative politicians 
(28 per cent) was less widespread than trust in mid-market tabloids but more prevalent 
than that in the “red-tops”. 


Conclusions
The Hutton inquiry found serious failings in the critical “sexing up” Today programme 
report by Andrew Gilligan on 29 May 2003, the subsequent reactions of the BBC’s 
management to government complaints about the veracity of the story and reliability 
of its source, and the effectiveness of the role played by the BBC's Board of Governors 
in the affair. The BBC was thrown into disarray, its Chairman and Director General 
resigned, and searching questions were asked about its governance. There were wider 
debates about the damage the findings of Hutton had wrought on the public’s 
perception of the Corporation and trust in journalism. This paper has focused on the 
last two issues and reviewed the findings of a series of opinion polls to shed light on the 
public’s views about the BBC, the government, the Hutton inquiry, and journalism. 
The evidence from these polling sources is that the reputations of neither the BBC 
nor journalists in general have been profoundly tarnished by the Hutton inquiry and 
its conclusions. The public had less than complete confidence in the independence of 
the inquiry itself and in the conclusions it reached. When invited to attribute blame for
the death of Dr Kelly, opinions changed over time. The government was consistently
regarded as more to blame than any other source and was certainly more widely 
admonished than the BBC. As the inquiry progressed and the date of publication of 
its findings drew nearer, however, more of the public began to question the 
culpability of the BBC. Furthermore, there was recognition amongst a majority of 
the public that senior heads might have to roll at the BBC should Hutton find against 
the Corporation. 

Once Hutton’s report was published, there was clear evidence that the public had 
been disappointed in the BBC’s internal processes and the decision-making by its 
management and the Board of Governors. Most criticism was directed at Andrew 
Gilligan. But a majority of people thought that the BBC’s Governors and its Director 
General, as well as Gilligan, had acted improperly (YouGov, 2004c). In fact, in terms of 
proper conduct, the Corporation fared worse than did the government in the public’s 
perceptions. There was a further indication that while most people’s opinions about the 
BBC had not changed, more reported feeling less positive than more positive about the 
Corporation (Riddell, 2004). 

It is likely, though, that public feelings about the BBC were temporarily affected
rather than permanently changed. People were still more than six times more likely to 
say they trusted the BBC than that they trusted the government (YouGov, 2004d). 
Moreover, most people felt that there had been no permanent damage done to the BBC 
by Hutton (British Life and Internet Project, 2004). Most believed that the Gilligan
affair had been a one-off aberration, while only a minority felt it was indicative of wider 
problems. There was no widespread support for an overhaul of the BBC in terms of the 
way it is run (British Life and Internet Project, 2004). 

Trust in journalists — whether inside or outside the BBC — showed no sign of 
damage either. In the aftermath of Hutton, most people still exhibited widespread trust 
in the major television news providers (ITV and the BBC) as well as in the journalism
of serious broadsheet newspapers (The Guardian, The Times, etc.). Mid-market and
“red-top” tabloid journalists in contrast were much less widely trusted, but this seems 
unlikely to have been caused by Hutton. There is, instead, an apparent “branding” of 
journalism and journalists in terms of the specific news media organizations they work 
for. News media are in part defined and differentiated in terms of the trustworthiness 
the public are prepared to assign to their typical journalistic outputs. This “trust 
brand” conditioning occurs over a long period and derives from experience. It leads to 
the establishment of reputation constructs that become attached to specific news media 
outlets that define expectations about the quality of their journalism. Once established, 
these constructs can be difficult to shift. 


Note 


1. The Populus poll was conducted by telephone with 500 adults aged 18+ across the UK,
with the data weighted to be representative of all adults. All YouGov polls were conducted 
online with samples of 2,312 (YouGov, 2004c), 2,004 (YouGov, 2004d) and 1,815 (YouGov, 
2004e), again weighted to be representative of the general UK population. The British Life 
and Internet Project survey was conducted online with 2,890 respondents by 
eDigitalResearch.com, weighted to be representative of the general UK population. 
Abstract


Purpose — To construct web visibility profiles of news web sites by examining hyperlinks pointing 
to the sites. 

Design/methodology/approach — National newspapers from the USA (USA Today), Canada (The
Globe and Mail), China (People’s Daily) as well as Hong Kong (Sing Tao Daily) were selected for the 
study. A total of 1,859 links pointing to the four news sites were manually classified into the four 
aspects of language, country, types of sites, and reasons or purposes for linking. 

Findings — A comparison of the four news sites provided useful information on their web visibility. 
The Globe and Mail seemed to have a larger international reach than USA Today. Neither newspaper 
web site attracted links from China or from pages in the Chinese language. Outside China, People’s 
Daily, an official Chinese Government newspaper, is not as visible as Hong Kong based Sing Tao Daily. 
USA Today and The Globe and Mail were used more for news citing or reprinting purposes while 
People’s Daily seemed to be used more as a research resource. 

Research limitations/implications — Link analysis like this provides us with only an indirect 
view of the online readership and the methodology has limitations. Not all readers create links to the
newspaper sites that they visit. Readers could be led to a news site through other venues including 
“social bookmarking” services. 

Practical implications — The study shows that link analysis is a novel and useful method that
journalists and information professionals can use to gauge online readership and potential impact of
news sites.

Originality/value — Presents a novel method that complements, but does not replace, other web user
studies such as web server log analysis.


Keywords Information media, Newspapers, Worldwide web 
Paper type Research paper 


Introduction and related research


The web has become a hot news medium. News agencies around the world, including
newspapers, radio and TV stations, have established a web presence to attract
a larger audience. Compared with their traditional counterparts, news web sites have
remarkable advantages including immediacy and virtually unlimited space. An even
more important advantage is that hyperlinks to a news web site are potential
doorways leading readers into the site. An increasing number of people have opted for
this medium to retrieve needed news or other information (Williams and Nicholas, 
1999). Online newspapers have been characterized as the media of the future
(Thiel, 1998) and the number of newspaper web sites around the world has doubled
since 1999 (Feuilherade, 2004). 

How effective is this new medium in disseminating information? Who uses news web sites
and how do they use them? These questions are important for both journalists and
information professionals. The current study attempts to address these questions by 
analyzing web hyperlinks pointing to newspaper web sites (i.e. inlinks to news sites). It 
complements previous studies that used other research methods such as surveys 
(AlShehri and Gunter, 2002) and web server log analysis (Nicholas et al., 2000).
Specifically, the study analyzed hyperlinks to the web sites of four major newspapers 
from around the world. A comparison of the four newspapers revealed useful 
information on their online visibility, reach of readers and potential impact. 

The four chosen newspapers are USA Today, The Globe and Mail (Canada), People’s 
Daily (mainland China), and Sing Tao Daily (Hong Kong, China’s Special 
Administrative Region). The first three newspapers are national newspapers in the 
three respective countries while Sing Tao Daily is one of the most reputable 
newspapers of Hong Kong. USA Today and The Globe and Mail were chosen to 
contrast with the two newspapers from China. The two Chinese newspapers form a very
useful contrast as well because they have very different histories, traditions, and
missions. USA Today and The Globe and Mail are in English while People’s Daily and
Sing Tao Daily are in Chinese. The web site of People’s Daily also has an English 
version but the Chinese version of the web site was used for the purpose of the study. 

The web is changing the rules of news publishing and dissemination, which calls for a
radical re-thinking of the existing practice and more research in this new area (Wu and 
Bechtel, 2002). Gilbert (2002) found very low readership overlap between online 
newspapers and their print counterparts. The finding highlights the significance of 
studying the new online media. Williams and Nicholas (1999) surveyed USA and UK 
online news sites and found that American newspapers were exploiting the advantages 
of web information very well. Massey and Levy (1999) analyzed 44 English-language 
online newspapers from 14 Asian countries and found that they focused on news 
content but did not take advantage of the immediacy, inter-site hyperlinking, or
communication between readers and the newspapers that web technology
provides. Nicholas et al. (1999a, b, 2000) conducted a series of studies on the online
readership of The Times and The Sunday Times. Their data source was the server
logs of these web sites. While web server logs provide rich information on how people
use these web sites, they have limitations in determining user identities due to the
anonymous nature of web users in most cases (Thelwall et al., 2005; Nicholas et al.,
2003). Web server logs are also not publicly available, which limits their use for research
purposes. Web hyperlinks as a data source, the approach used in this study, complement
web server log analysis. Links to a web site can be found by searching publicly
available search engines such as Yahoo!.

It is fruitful to study links to news web sites because links are purveyors of trust in 
their targets and could be used as a collective source of new information on the web 
(Cronin, 2001; Davenport and Cronin, 2000). Numerous empirical studies have found 
that the number of links to web sites can be an indicator of the caliber of the 
organization hosting the web site. Specifically, the number of links to an academic web 
site correlates with the university research output (Thelwall, 2001) while the count of 
links to a commercial site correlates with the company’s business performance 
measures (Vaughan, 2004). Very few studies of news web sites, however, have chosen
web hyperlinks as the object of the analysis. Among the few is the Tremayne (2004) 
study that analyzed the use of hyperlinks in news stories on the web. He examined 
1,500 web news stories over a five-year period, looking specifically at the hyperlinks 
embedded in these news stories. He found that the use of links in web news stories is 
increasing in ways predicted by the network theory. In contrast to his study which 
examined the quantity of links, our study examines the nature of links. Another 
difference is that his study dealt with outlinks (links going out from the news web site 
to other sites) while our study investigates inlinks (links coming into the news web site 
from other sites). 


Methodology 

The URLs for the four newspapers in the study are www.usatoday.com (USA Today),
www.globeandmail.com (The Globe and Mail), www.peopledaily.com.cn (People’s 
Daily), and www.singtao.com (Sing Tao Daily). Yahoo! (www.yahoo.com) was used to 
search for inlinks (also called links later in the paper) that point to these web sites. 
There are two types of inlinks, total inlinks and external inlinks. Total inlinks include 
all links pointing to a particular site while external inlinks include only links coming 
from web sites outside the site in question. In other words, external inlinks do not 
include links within the site itself, such as the “back to home” type of navigational links 
within the site. Our study only examines external links because internal links are not 
indicators of online visibility or impact. Earlier webometrics studies (Thelwall, 2001; 
Vaughan and Thelwall, 2003) did not use Google for data collection as it could not 
perform external link searches. Although external link search commands can be 
entered at Google at the time of writing (June 2005), extensive testing of the search 
results showed that the results are simply wrong. For example, in a search on 6 June 
2005, Google reported only 94 external links to USA Today homepage while Yahoo! 
found over three million external links to the page. Further exploration of Google 
documentation confirmed that Google indeed cannot perform external link
searches at this time. The Google API documentation states that “no other query terms
can be specified when using this special query term”, i.e. the “link” query term (Google, 
2005). 

Two other search engines that have been used for web link studies in the past are 
AltaVista and AllTheWeb. Both were acquired by Yahoo! in early 2004 and their 
databases became a subset of Yahoo!’s. Therefore, Yahoo! was preferred over these two
engines at the time of data collection (fall 2004). Another search engine that could do 
external link searches at the time of the data collection was MSN. However, it retrieved 
a much smaller number of hits than Yahoo! did. Even after MSN launched its new
version on 1 February 2005 (Microsoft, 2005), it still retrieved fewer external link 
search hits than Yahoo! did in June 2005 and it could only display the first 250 search 
results while Yahoo! could display the first 1,000 hits. All these factors led to the choice 
of Yahoo! for data collection. It is recognized that different search engines index 
different web sites and that there are advantages of using multiple search engines for 
data collection (Thelwall, 2004). However, this study is different from other 
quantitative link analyses where the number of links is a main measurement. The 
purpose of the study was not to find every single link to the four web sites or to 
compare the absolute number of links of the four sites. The study needed only a sample
of links to be manually examined to construct web link profiles of the four sites and to
make relative comparisons among them. Thus, using a single search engine for data 
collection is not likely to bias the results significantly. 

The query used to search for external inlinks in Yahoo! (using the URL of USA
Today as an example) was “link:www.usatoday.com -site:usatoday.com”. It should
be noted that this Yahoo! query only retrieves links that point to the homepage of the 
site. Links that point to other pages of the site are not included. Another Yahoo! 
command “linkdomain” will retrieve all links pointing to the whole site, not just the 
homepage. However, an exploration showed that the majority of links point to the 
homepage of a site and all four sites in the study received more than 10,000 links to 
the homepage alone. It was decided to use the “link” command rather than the 
“linkdomain” command so that all four sites were compared at the level of links to the
homepage. Data collection took place during the first week of November 2004.
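The query pattern described above can be written out for all four sites. The following Python sketch is illustrative only: it builds the query strings and performs no search. It assumes that the operators are typed without internal spaces, that the exclusion is a plain hyphen before "site:", and that each site's own domain is its host name without the leading "www."; the function name `external_inlink_query` is hypothetical.

```python
# The four newspaper hosts listed in the Methodology section.
NEWSPAPER_HOSTS = {
    "USA Today": "www.usatoday.com",
    "The Globe and Mail": "www.globeandmail.com",
    "People's Daily": "www.peopledaily.com.cn",
    "Sing Tao Daily": "www.singtao.com",
}


def external_inlink_query(host):
    """Build a query for pages linking to the homepage of `host` from other sites."""
    # Assumption: the site's own domain is the host name minus a leading "www.".
    domain = host[len("www."):] if host.startswith("www.") else host
    # "link:" targets the homepage; "-site:" excludes links from the site itself.
    return f"link:{host} -site:{domain}"


if __name__ == "__main__":
    for name, host in NEWSPAPER_HOSTS.items():
        print(f"{name}: {external_inlink_query(host)}")
```

Keeping the exclusion term in every query is what restricts the results to external inlinks, the only kind the study treats as an indicator of visibility.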

The number of links to the four web sites retrieved from Yahoo! is shown in the 
middle column of Table I. It would be extremely time consuming and unnecessary to 
examine all these millions of links. Besides, Yahoo! will only display the first 1,000 hits 
of a search result. A systematic sampling method was used to select every second 
retrieved link that Yahoo! can display. This means that 500 links to each newspaper’s 
web site were examined. URLs of these linking pages (pages that initiated the links) 
were copied into an Excel file for further data cleansing, as follows.

Not all links retrieved by a Yahoo! search could be examined. Some were dead links 
while others were outdated links (i.e. the retrieved page originally had a link to the site 
under study but the content of the page had been changed since Yahoo! indexed it and 
the current page had no link to the site under study). These invalid links were removed 
from the study. There were also duplicate URLs in Yahoo! search results, i.e. the same 
page showed up more than once in the same search result. These duplicate pages 
were identified using Excel’s “find” function and then removed one by one. The 
number of links left to be studied for each newspaper is shown in the right column of
Table I.
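The sampling and cleansing steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure actually followed (the cleansing was done by hand in Excel): the function names are hypothetical, the sample is assumed to start with the first displayed hit, and `is_valid` stands in for the manual check that a page is live and still links to the newspaper site.

```python
def systematic_sample(hits, step=2):
    """Keep every `step`-th hit, starting with the first (an assumption)."""
    return hits[::step]


def drop_duplicates(urls):
    """Remove repeated URLs while keeping the first occurrence of each."""
    seen = set()
    unique = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            unique.append(url)
    return unique


def clean_sample(hits, is_valid, step=2):
    """Sample the hits, discard invalid links, then remove duplicate pages."""
    sampled = systematic_sample(hits, step)
    valid = [url for url in sampled if is_valid(url)]
    return drop_duplicates(valid)


if __name__ == "__main__":
    # Placeholder URLs standing in for the 1,000 hits a search could display.
    hits = [f"http://example.org/page{i}" for i in range(1000)]
    print(len(systematic_sample(hits)))  # every second hit of 1,000
```

With 1,000 displayable hits and a step of two, the sample holds 500 links per newspaper before cleansing, which matches the figure given above; the counts in the right column of Table I are lower because invalid and duplicate links were removed.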

A total of 1,859 links to the four newspaper web sites were examined. The content of 
the linking page as well as the context of the link was analyzed. Each linking page was 
then classified into the following four aspects: 


(1) language (English, Chinese, and others);
(2) country (USA, Canada, mainland China, Hong Kong, and others);
(3) type of site (e.g. corporate site, government site, etc.; see Appendix 1 for details of
the categories); and
(4) reasons or purposes for linking (why was the link to the newspaper web site
created, e.g. to cite or reprint a news item; see Appendix 2 for details of the
categories).


Table I.

Newspaper             Number of links found by Yahoo!    Number of links used for study
USA Today             2,130,000                          475
The Globe and Mail    363,000                            459
People’s Daily        316,000                            455
Sing Tao Daily        11,000                             470


In the above classification, “country” was determined by the location of the individual 
or organization that was responsible for the content of the page, rather than by the 
physical location of the web server on which the page was stored. This is consistent 
with the OCLC (2002) method of country classification. “Type of site” was determined by
the content of the site, rather than by a domain name suffix such as .org or .edu. Categories for
“reasons or purposes for linking” were developed through an induction process based
on grounded theory (Berg, 1995, pp. 179-81). It used latent content analysis as opposed
to the manifest content analysis approach (Babbie, 2001, p. 310; Berg, 1995, p. 176). A
group of linking pages was examined first to identify potential categories. These
categories were then applied to classify another group of pages to fine-tune the
categories. The final categories used to classify reasons or purposes for linking were: to
list news media, to cite or reprint news items, to show partnership, and others. See 
Appendix 2 for examples of these categories. 


Results 

It is clear from the data in Table I that the web site of USA Today received far more links
than the others in the study. The Globe and Mail and People’s Daily were very close in link
counts while Sing Tao Daily was a distant fourth. The link count alone only provides a
picture of web visibility in absolute terms while the classification results reported 
below provide a much richer description of the online visibility and potential impact of 
the four web sites. The results are organized by the four variables of the classification. 


Language 

Table II shows the language breakdown of pages linking to the four newspaper web 
sites. The most striking figure is that 83 percent of pages are in English despite the fact 
that two newspapers in the study are in Chinese. This reflects the dominance of the 
English language on the web that is reported in earlier studies (OCLC, 2002). It is also 
worth noting that none of the pages linking to USA Today and The Globe and Mail are 
in Chinese despite the fact that a significant portion of the web is in Chinese and the 
size of the web user population of China is second only to that of the USA (Chua, 2004). It is
important to point out that data in Table II should not be interpreted in absolute terms
but rather used only in relative comparisons, as they were collected from a commercial
search engine, Yahoo!. No search engine indexes every web site in existence and each
engine has its own crawling or coverage policy. Vaughan and Thelwall (2004)
demonstrated that some major search engines under-represented web sites from China.
This might partially explain the low percentage of Chinese pages in Table II. However,
a relative comparison of the two Chinese newspapers can still be made. Table II shows
a strong contrast: the web site of People’s Daily received a much smaller portion of links
from English web pages than that of Sing Tao Daily (41.5 percent vs 92.3 percent). This
suggests that the former is much less visible in the English world than the latter. This
finding is also confirmed later in the analysis by country.

Table II. Language of pages linking to the four newspaper web sites

                      English         Chinese        Others
Newspaper             No.      %      No.     %      No.    %      Total
USA Today             471     99.2      0     0        4    0.8      475
The Globe and Mail    454     98.9      0     0        5    1.1      459
People’s Daily        189     41.5    259    56.9      7    1.5      455
Sing Tao Daily        434     92.3     35     7.4      1    0.2      470
Total               1,548     83.3    294    15.8     17    0.9    1,859
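
The percentages reported for Table II can be rechecked from the raw link counts. The following is a minimal sketch, assuming the counts are transcribed correctly from the table; the variable and function names are illustrative and not part of the original study.

```python
# Link counts per newspaper, transcribed from Table II
# in the order (English, Chinese, Others).
counts = {
    "USA Today": (471, 0, 4),
    "The Globe and Mail": (454, 0, 5),
    "People's Daily": (189, 259, 7),
    "Sing Tao Daily": (434, 35, 1),
}


def row_percentages(row):
    """Return each count as a percentage of the row total, to one decimal."""
    total = sum(row)
    return [round(100 * n / total, 1) for n in row]


# Per-newspaper totals and language shares.
for paper, row in counts.items():
    print(paper, sum(row), row_percentages(row))

# Column totals and overall language shares across all four newspapers.
column_totals = [sum(col) for col in zip(*counts.values())]
grand_total = sum(column_totals)
overall = [round(100 * n / grand_total, 1) for n in column_totals]
print("Total", grand_total, overall)
```

Recomputing the shares in this way is also a convenient check on the table itself, since each row percentage must be consistent with its count and row total.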


Country 
Table III shows the country breakdown of pages linking to the four newspaper web sites.
An examination of the percentage figures in the table suggests that there is a
relationship between the country of the newspaper and the country where the links were
coming from. For example, the majority of pages linking to USA Today and The Globe
and Mail were from the USA and Canada, respectively. A chi-square test confirmed that the
relationship is statistically significant (p < 0.01). This is not surprising in light of
findings from other studies which showed that web pages tend to link to other pages in 
the same country (Thelwall, 2002). However, it is very interesting and somewhat 
surprising that the two Western newspapers in the study have a quite different 
international reach. Over 90 percent of links to USA Today were from the USA while 
only 52.7 percent of links to The Globe and Mail were from Canada. The Globe and Mail 
received about 35 percent of links from the USA while USA Today had no link from 
Canada. The Canadian newspaper also received relatively more links from other 
countries than its USA counterpart. Although an earlier study (Vaughan and Thelwall, 
2004) found a search engine bias toward USA sites vs Chinese sites, no study has
shown a bias toward USA sites over Canadian sites. It is very unlikely that the
large discrepancy between the two newspapers reported above can be entirely attributed
to the over-representation of USA pages in search engines. It is thus likely that the
Canadian newspaper had a larger international reach than its American counterpart.
A parallel contrast exists between the two Chinese newspapers. People’s Daily was 
much less visible internationally than Sing Tao Daily as measured by the links coming 
from other countries. The two Western newspapers had no links from China, which
confirms earlier findings in the language analysis. However, the two Chinese
newspapers both attracted a large percentage of links from North America, probably
due to the large number of Chinese people living in North America. A remarkable
phenomenon is the extremely low percentage of links (0.2 percent) from mainland
China to the Hong Kong newspaper Sing Tao Daily. The same is true of links from 
Hong Kong to People’s Daily (0.4 percent). 
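
The chi-square test of independence referred to in this section can be sketched as follows. The Table III country counts are not reproduced in this text, so the sketch uses the Table II language counts purely as stand-in data; it is an illustration of the technique under that assumption, not a reproduction of the reported test. The critical value is the standard tabulated one for 6 degrees of freedom at the 0.01 level, and the function name is illustrative.

```python
# Stand-in data: Table II language counts (English, Chinese, Others).
# These are used only because the Table III counts are not available here.
observed = [
    [471, 0, 4],    # USA Today
    [454, 0, 5],    # The Globe and Mail
    [189, 259, 7],  # People's Daily
    [434, 35, 1],   # Sing Tao Daily
]


def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for Pearson's chi-square test."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if newspaper and category were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


statistic, dof = chi_square_independence(observed)
CRITICAL_0_01_DF6 = 16.812  # tabulated critical value for df = 6, alpha = 0.01
significant = statistic > CRITICAL_0_01_DF6
print(f"chi-square = {statistic:.1f}, df = {dof}, significant at 0.01: {significant}")
```

With these stand-in counts, the expected values in the Others column fall slightly below 5, so a more careful analysis might merge sparse categories before testing.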


Type of sites 

Web pages linking to the news web sites under study were classified into various types 
as discussed earlier in the Methodology section. The classification results are 
summarized in Table IV. Not surprisingly, the most common type of site is “news 
sites” (25.4 percent), followed by “commercial portals” (20.1 percent). It is somewhat 
surprising that “personal sites” (17.9 percent) are more common than
“academic portals” (8 percent). Originally, the classification scheme had “educational
site” as a category. However, there were very few of this type of site so they were 
merged into the “other” category in Table IV. 
The four newspapers are significantly different in the types of sites from which they
attracted links (chi-square test, p < 0.01). As shown by the percentage figures in 
Table IV, both USA Today and The Globe and Mail attracted more links from news 
sites (46.5 percent and 32 percent, respectively), suggesting that they were used mainly 
as news sources. The Globe and Mail was also a very popular link target for personal weblogs of
news (35.9 percent), again as a news source. On the other hand, People’s Daily
attracted relatively fewer links from news sites but more links from portal sites (38.7 
percent commercial portals and 12.3 percent academic portals). Being an official 
government newspaper, it also attracted more links from government sites. In contrast, 
Sing Tao Daily received no links from government sites, and the most common type of
site linking to it was personal sites, which reflected its status as a non-official
newspaper in China.


Reasons or purposes for linking

Reasons or purposes for linking to the four newspaper sites are summarized in Table V. 
When the four newspapers are combined, the most common reason for linking is “to
list news media” (77.1 percent), followed by “to cite or reprint news items” (12.5
percent). When the four newspapers are examined separately, reasons or purposes for
linking to them are significantly different (chi-square test, p < 0.01). The most
common reason (92 percent) for linking to the two Chinese newspapers was to list news
media, which usually means including the newspaper in a simple list of news media
rather than citing specific news items. Only about 5 percent of links to these two
sites were to cite or reprint their news.
In contrast, the two Western newspapers were linked relatively more frequently for 
news citing or reprinting purposes (14.1 percent for USA Today and 26.4 percent for 
The Globe and Mail). Further, there is a difference between USA Today and The Globe 
and Mail. The former received more links from its partners (“to show partnership” 
category), which could be explained by the fact that many local American newspapers 
are partners of USA Today. 


Discussion and conclusions 
The analysis of the links to the four newspaper web sites draws a profile of their web 
visibility and impact. A common feature is that links tend to come from the 
newspaper's own country. However, different newspapers had very different 
international reach as measured by the proportion of links coming from outside the 
country. The Canadian newspaper The Globe and Mail seemed to have more 
international visibility than its American counterpart USA Today, as shown by the 
percentage of links coming from outside the country (47.3 percent vs 7.8 percent). A 
parallel contrast existed between the two Chinese newspapers with People’s Daily 
being less visible internationally than Sing Tao Daily (47 percent vs 84 percent of links 
coming from outside China). This is ironic as one of the missions of People’s Daily is to 
serve as a window to China for the world. People’s Daily claims that it is the “first class 
distribution center of information about China” (People’s Daily Online, 2005). Although 
the newspaper succeeded in attracting a significant portion of the links (47 percent) 
from outside China, it was not as internationally visible as Sing Tao Daily, a Hong 
Kong based regional newspaper. 

The analysis of the language of pages that linked to the newspaper web sites 
confirms the above findings. Sing Tao Daily had proportionally more links from the 
non-Chinese world than People’s Daily had. The two Western newspapers attracted no
links from pages in the Chinese language or pages originating from China. The contrast
between the two Chinese newspapers could be explained by the fact that Sing Tao Daily is
part of a free press while People’s Daily is controlled by the Chinese Government.

The type of sites that linked to the newspaper sites reflects the nature of the 
newspapers very well. People’s Daily, an official Chinese Government newspaper, 
received more links from government sites while Sing Tao Daily had more links
coming from personal sites. The most common types of sites that linked to USA Today
and The Globe and Mail were news-related sites (personal weblogs of news and news
sites), suggesting that they were used mainly as news sources. In contrast, People’s
Daily was under-linked by news sites, which showed that it was not used mainly as a 
news source. It received proportionally more links from academic portal sites, 
suggesting that it was used more as a research resource. 

Examining reasons or purposes for linking to the newspaper web sites provided us
with a deeper understanding of why links were made. Links to USA Today and The
Globe and Mail fell proportionally more into the category of “to cite or reprint news
items” while those to the two Chinese newspapers fell more into the category of “to list
news media”. This means that news items in the two Western newspapers were cited or
reprinted more than those in the two Chinese newspapers. This could be interpreted to mean
that the two Western newspapers were perceived to be more authoritative news sources.

The study constructed newspaper web visibility profiles through analyzing links to 
the web sites. A comparison of the four newspapers provided useful information on 
their international reach and potential impact. It should be noted, however, that the 
study is very limited in scope as only one newspaper from each country/region was 
examined. We do not suggest that these four news sites represent other news sites from 
those countries. Newspapers in the study were chosen because they were national or 
major newspapers but they do not represent all newspapers from that country. It is an 
exploratory study to find out if link analysis can help to construct web profiles of these 
sites. Larger scale studies are needed to test if the conclusions from the study can be 
generalized to other newspapers from the same countries. 

The study shows that link analysis is a novel and useful method that journalists 
and information professionals can use to gauge online readership and potential impact 
of news sites. It should be noted that link analysis like this provides us with only an 
indirect view of the online readership and the methodology has its limitations. Not all 
readers create links to the newspaper sites that they visit. Readers could be led to a 
news site through other venues including “social bookmarking” services. However, 
links are very important doorways that could lead readers to the site. Link analysis 
thus allows us to estimate the potential readership. This kind of analysis complements 
but does not replace other web user studies such as web server log analysis. Web 
server log data are a more direct measure of the use of the site but user identities are 
mostly unknown there. Results from this qualitative study of link classification
confirm conclusions from earlier quantitative studies which demonstrated that web
hyperlinks contain useful information and can be objects of web data mining. 
Appendix 1. Types of web sites

Corporate web sites run by companies 

Example: a link from www.cp-lumber.com to www.globeandmail.com. The former is a corporate 
web site run by a company providing web services. 


Non-profit organization web sites 
Example: a link from www.sa-chinese.org to www.peopledaily.com.cn. The former is a non-profit 
organization’s web site. 


Government web sites 
Example: a link from www.metrotown.info/canada-vancouver to www.globeandmail.com. The 
former is a Canadian Government site. 


Personal weblog of news 
Example: a link from www.punditmark.com to www.usatoday.com. The former is a personal 
weblog of news. 


Personal web sites (web sites maintained by individuals and the site content is of personal interests) 
Example: a link from home.att.net/~ C.CJordan to www.usatoday.com. The former is a
personal site. 


News web sites (web site’s main content is news report) 
Example: a link from www.delmarvanow.com/ to www.usatoday.com. The former is a news site. 


Academic portals (directories or collections of links for research, reference, or educational purpose) 
Example: a link from www2.cddc.vt.edu/polisci/index.php3?catid=49 to www.peopledaily.com.cn.
The former is an academic directory.





Commercial portals such as Yahoo! 
Example: a link from www.ceoexpress.com/ to www.usatoday.com. The former is a commercial 
web portal. 


Others 
Example: a link from www.judges.org to www.usatoday. The former is a US judicial educational 
web site. 


Appendix 2. Reasons or purposes for linking 

To list news media 

Example: a link from www.politicalresources.net/hong-kong4.htm to www.singtao.com (the Sing 
Tao Daily). The former is an academic portal site that lists news media and other online 
resources. 


To cite or reprint news items 
Example: a link from www.london.planettoday.com to www.globeandmail.com (The Globe and 
Mail). The former is a news site that reprints news from The Globe and Mail. 


To show partnership 
Example: a link from www.burlingtonfreepress.com/ to www.usatoday.com (USA Today).
Burlington Free Press is a partner of USA Today. 


Others 
Example: a link from www.hrmjobs.com/jobads.html to www.usatoday.com (USA Today). 
The former is an online recruitment site and it lists the latter as one of its clients. 
Abstract
Purpose — The aim of this paper is to reveal an information dynamic in which technology or dead 
labour is substituted for living mental labour. 


Design/methodology/approach — A comparative historical review of technologies for reproducing 
written utterances and their relation to living labour. 


Findings — A dynamic for mental labour, similar to that for physical labour, is isolated.


Research limitations/implications — The productivity paradox, a central concern of the 
information systems literature, is dissolved. 


Originality/value — The paper is relevant to both information science and information systems. 
Understanding an information dynamic can enable intervention in that dynamic. 

Keywords Labour, Information, Knowledge economy 
Introduction

A fundamental dynamic has been detected in capitalism. Innovation in techniques of 
production is constantly pursued in order to reduce unit costs of production and to 
obtain a greater than average rate of profit growth (Mandel, 1981, p. 20). 
Characteristically, although not necessarily uniformly, reduction of costs is achieved 
by innovations which transfer components of production from direct human labour to 
machine processes, reducing the amount paid in wages for direct human work and
increasing fixed capital. Machinery is itself regarded as the cumulative product of
human labour over time (Marx, 1973, p. 706). For Marx, the “basic logic of the capitalist
mode of production ... [was] expansion, growth, enlarged reproduction, through a 
substitution of living by dead labour” (Mandel, 1981, p. 13). Classically, labour has 
been conceived as physical labour either on “objects of labour spontaneously provided 
by nature” or on those objects filtered through previous labour (Marx, 1976, pp. 284-5). 
Mental labour, particularly as science (Rosenberg, 1974), has been recognised as 
significant to enhancing the scope of human productive powers, but has less often been 
isolated and studied as a form of labour in itself (Warner, 2005). A similar dynamic, of a 
search for innovation and a substitution of dead for living labour, could be detected 
with regard to mental labour and its transfer to information and communication 
technologies (ICTs). 

Discovering an information dynamic would be relevant to both information science, 
and information systems, disciplines with some common concerns, but historically 
divided (Ellis et al., 1999). Attention to the reproduction of written utterances has
affinities with information science, particularly with its interest in documentary
communication. Innovation and productivity are more prominent concerns of the 
information systems literature, although there has been attention to labour and its 
costs, for such activities as cataloguing, within information science (Hayes, 2000; 
Warner, 2005). 

Establishing the existence of an information dynamic could explain and dissolve 
major concerns of the information systems literature, with the impact of technologies, 
with the productivity paradox, and with the lack of connection between micro- and 
macro-levels of consideration. A debate has existed as to whether ICTs have impacts 
by themselves or only through their use, although use rather than inert presence is 
being increasingly favoured as the determining factor (Orlikowski, 2000). If technology 
is recognised as a radical human construction, in a deeper sense than for the social 
construction of technology, as “organs of the human brain, created by the human hand, 
the power of knowledge objectified” (Marx, 1973, p. 706), both made and revivified by 
human activity, then the investment and reinvestment of human labour in its use is 
crucial to it having effects. Information technology, as contrasted with productive 
technology, is concerned with the transformation of signals from one form or medium 
into another rather than of natural resources into useful goods and services (Warner, 
2004, pp. 5-35).

The productivity paradox itself has been understood with two emphases: as concern 
at the discrepancy between the expectations of the transformative power of ICTs and 
their limited effects on introduction to practice; and, more broadly, as the lack of 
correlation between investment in ICTs and commercial success, at both the micro- or
firm and macro- or collective economy level. Productivity is initially conceived, in
accord with classic antecedents in political economy, as the amount produced per unit
of input, and then, more strongly, as use- or exchange-values produced, or simply as 
commercial success. Difficulties of measurement of output for information goods and 
services are acknowledged, particularly when output is understood as value for 
consumers (Brynjolfsson, 1993; Brynjolfsson and Hitt, 1998). More recent work, 
developing from the late 1990s, has recognised the contribution of ICTs to commercial
success, although as one among other commodity inputs rather than as a
transformative and differentiating force for individual firms (Carr, 2004).
Information, and then communication, technologies are considered by the
information systems literature, corresponding to the chronology of the diffusion of
the computer and modern message transmission networks, epitomised by the internet.
Control mechanisms in industrial machinery receive less direct (Brynjolfsson, 1993, p.
71), or separate (Beniger, 1986), attention, possibly reflecting a divide in consciousness 
rather than in practice. The time frame of studies is characteristically restricted, not 
allowing transition costs or the risks of innovation to be separated out from equilibria 
obtained, and the failure to discover positive effects of the introduction of ICTs could 
be an artefact of this restriction. The productivity paradox could be received as an 
empirical observation rather than as a fully developed theory, which would attend to 
the motivating forces which generated the phenomena observed. Studies give little 
evidence of close attention to the relation of information technology to direct human 
labour, either diachronically or synchronically. A need to reconsider the way in which 
productivity and output is measured has been recognised (Brynjolfsson, 1993, p. 76). 
The classic conceptions of information quantity developed from information theory 
(Shannon, 1948), and recently exploited as measures of world production of
information (How much information, 2003), could be used to inform a strict, and 
relatively unambiguous, understanding of productivity as the amount of information 
produced per unit of human input. 
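
One way to make this strict sense of productivity concrete is sketched below. It estimates information quantity as the Shannon entropy of a text's character distribution multiplied by the text's length, and divides the result by labour time. The sample text, the labour figures, the function names, and the decision to count each copy's signal quantity separately are all assumptions made for illustration; a zero-order character model is only one of several possible estimates.

```python
import math
from collections import Counter


def information_bits(text):
    """Estimate information quantity as length times zero-order character entropy."""
    counts = Counter(text)
    n = len(text)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return n * entropy


def productivity(text, copies, labour_hours):
    """Bits reproduced per hour of direct human labour (a hypothetical measure)."""
    return copies * information_bits(text) / labour_hours


page = "put writing in your heart"  # illustrative sample only

# Both labour figures below are assumed values, not data from any source.
by_hand = productivity(page, copies=1, labour_hours=0.05)
by_press = productivity(page, copies=1000, labour_hours=20.0)

# Ratio of the two productivities; the text's own entropy cancels out.
ratio = by_press / by_hand
print(ratio)
```

Because the same text appears in both calculations, the ratio depends only on copies per labour hour, which is why a measure of this kind is not confused with the exchange- or use-value of what is copied.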

Micro-level studies which concentrate on the process of innovation, and 
macro-level considerations of the results of innovation are recognised to be poorly 
integrated with each other, with concerns divided between the disciplines of 
information systems and economics (Avgerou and La Rovere, 2003). Classic 
distinctions between invention, innovation, and diffusion, widely diffused within 
economics, could be used to link micro- and macro-levels. Invention is understood 
as the process of transforming an idea for a technology into a demonstration of 
technical feasibility and also as the product demonstrated; innovation as the more
robust commercial product developed from the invention and as the entry of the
product into the economic system; and diffusion as the spread of the innovation. 
Conceptualisation is also occasionally distinguished and is understood both as the 
idea for a process or technology, and, more rigorously, as a more formalised model 
for that process (Warner, 2004, pp. 5-35). Invention and innovation are micro-level 
products and processes and diffusion the collective or macro-result, produced by 
the adoption of innovations. 

Human labour must be understood to include mental labour. Semantic labour, 
concerned with the meaning of symbols, has been differentiated from syntactic 
labour, operating exclusively on the form of symbols. Some forms of originally 
semantic labour have been modelled as syntactic labour. Syntactic labour can be 
transferred from human activity to machine processes, including, but not limited 
to, an appropriately programmed computer (Warner, 2005). Copying could be 
regarded as the archetype of syntactic labour. Labour, process, and product can 
also be distinguished. For instance, labour would be the work involved in creating 
a concordance, process the algorithm for its creation, and product the concordance 
produced. 
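
The distinction between labour, process, and product can be illustrated with a minimal sketch. Here the function stands for the process and its return value for the product, while the labour of running it falls to the machine. The function name and the sample lines are assumptions for illustration, and matching word forms by pattern operates only on the form of symbols, which is why it can be read as syntactic labour.

```python
import re
from collections import defaultdict


def build_concordance(lines):
    """Map each word to the sorted 1-based line numbers on which it occurs."""
    index = defaultdict(set)
    for number, line in enumerate(lines, start=1):
        # Purely formal matching: runs of letters and apostrophes, lower-cased.
        for word in re.findall(r"[a-z']+", line.lower()):
            index[word].add(number)
    return {word: sorted(numbers) for word, numbers in sorted(index.items())}


sample = [
    "The scribe is released from manual tasks",
    "it is he who commands",
]
concordance = build_concordance(sample)
print(concordance["is"])
```

Deciding which words deserve an entry, or which senses of a word should be separated, would by contrast remain semantic labour.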

A historical study of technologies for the production and reproduction, or copying, 
of written utterances, in predominantly administrative rather than public contexts, can 
reveal a dynamic in which direct human mental labour is progressively transferred to 
machine processes, although still with essential elements of human intervention. The 
productivity paradox can also be addressed. Costs and effects of transitions between 
states can be separated out from the equilibria obtained, by the expanded time frame. 
A rigorous understanding of information quantity, derived from information theory, 
and not confused with the exchange- or use-value of information products, can be used 
to reveal contrasts in information productivity, in its strict sense. Micro-level process, 
the adoption and use of technologies by individual agents, and macro-level results, 
particularly the availability of proven technologies and increases in social 
productivity, can be seen to generate each other. The concern is with syntactic 
labour, in its archetypal form as copying, although its progressive separation from 
semantic labour is acknowledged. In this context, the concern will be with information 
or data processing technologies, although similar dynamics, with highly similar, 
although not identical, chronologies, could be established for communication or 
message transmission technologies and for control mechanisms. 





Production and reproduction of written utterances 

The Neolithic revolution, with its transition from hunter-gatherer to agricultural and 
urban economies (the word, urban, has been traced to the Latin urbs or the mouldboard 
of the plough (Vico, 1744, p. 11)), compelled enhanced administrative and social 
organization for access to necessities for life (Hobsbawm, 1998, p. 215; Childe, 1936). 
Writing, and other forms of graphic communication (Warner, 2003), are essential to the 
organization of the elaborated polis. A document from the Egyptian New Kingdom 
contrasts scribal activity with physical labour: 


Put writing in your heart that you may protect yourself from hard labour of any kind and be a
magistrate of high repute. The scribe is released from manual tasks; it is he who commands.
... Do you not hold the scribe’s palette? That is what makes the difference between you and
the man who handles an oar.


I have seen the metal-worker at his task at the mouth of his furnace with fingers like a 
crocodile’s. He stank worse than fish-spawn. Every workman who holds a chisel suffers more 
than the men who hack the ground; wood is his field and the chisel his mattock. At night 
when he is free, he toils more than his arms can (? at overtime work); even at night he lights 
(his lamp to work by). The stone-cutter seeks work in every hard stone; when he has done the 
great part of his labour his arms are exhausted and he is tired out. ... The weaver in a 
workshop is worse off than a woman; (he squats) with his knees to his belly and does not taste
(fresh) air. He must give loaves to the porters to see the light (quoted in Childe, 1936, pp. 187-8). 


At this stage, the scribe is “he who commands”, characteristically embedded in a 
priestly position, with the priestly merging into the royal (Childe, 1936). Intellectual 
work is regarded as a release from physical toil rather than as burdensome or 
laborious. 

The functions for writing which could be distinguished in modern practice are not 
yet separated out, but subsumed within a single set of activities. From a modern 
perspective, informed by an understanding of information technologies as control 
mechanisms (Minsky, 1967, p. vii), there is the strong possibility that graphic 
inscription is functioning as a control mechanism for embodied human mental labour 
to organize and direct human physical labour. Technically, in the classic sense 
developed in automata theory, the control mechanism is being used 
non-deterministically, with continuous human intervention (Minsky, 1967; Warner, 
1994). Semantic and syntactic aspects of writing are interweaved, with schools for 
writing functioning as analogues to scientific research institutes, although Old 
Kingdom manuscripts were copied (with inaccuracies) (Childe, 1936, pp. 189-91). 
Labour for the production of written utterances would be greater than for oral speech, 
both directly at the point of production and with regard to the additional preceding 
education required, but the written form allows for greater control of complexity and 
more exact replication over time (Warner, 2001, pp. 34-46). Natural materials are 
adapted for the humanly constructed technology, both for the substrate and the 
instrument of inscription. 

The dynamic which compelled the development of increasingly sophisticated forms 
of graphic representation is not that of capitalism but of a search for increased control 
over the environment. It has been argued that it is difficult to conceive of an overall
narrative of human history which does not take as its analytical framework the “persistent
and increasing capacity of the human species to control the forces of nature by means 
of manual and mental labour, technology and the organization of production” 
(Hobsbawm, 1998, p. 215). Techniques are preserved and carried forward by active
renewal, in a particular exemplification of the proposition, that men make their own 
history, by transforming their inheritance. 

A historical leap forward to a mid-nineteenth century United States legal office 
reveals a continuity in techniques and changes in function and status of participants 
(the possibility of the leap implies a “punctuated equilibrium” in the human 
development of information technologies analogous to that found in the evolution of 
natural forms). Written utterances in administrative contexts are still reproduced by 
writing by hand, although copy and original are compared, or collated, by an oral 
reading. 


It is, of course, an indispensable part of a scrivener’s business to verify the accuracy of his 
copy, word by word. Where there are two or more scriveners in an office, they assist each 
other in this examination, one reading from the copy, the other holding the original. It is a 
very dull, wearisome, and lethargic affair. I can readily imagine that, to some sanguine 
temperaments, it would be altogether intolerable. For example, I cannot credit that the 
mettlesome poet, Byron, would have contentedly sat down with Bartleby to examine a law 
document of, say five hundred pages, closely written in a crimpy hand (Melville, 1853, 
pp. 24-5). 


Intellectual and clerical roles are now separated from each other and the work involved 
in the reproduction and comparison of utterances is experienced as burdensome. 
Boundaries between roles and activities can be transgressed, as they are in “Bartleby”, 
but crossings are recognised as transgressions. The education needed for the higher 
status and better paid role could be regarded as the production costs of that labour, or 
as giving credentials to obtain higher market value (Collins, 1979). 

The continuity in techniques, which would not extend to identity in the materials 
used for the substrate or in inscription, can be traced to preservation by renewal
characteristic of pre-capitalist cultures: “earlier modes of production were essentially
conservative” (Marx, 1976, p. 617). Changes in function and status of scribal
occupations could be ascribed to changes in social organization, including the diffusion 
of literacy under Protestantism and the separation of semantic and syntactic aspects of 
intellectual labour. Writing is closely connected with oral speech but its production 
would still be slower, demanding greater physical labour. 

The intensity of the clerical labour described by Melville hints at the possibility of a 
need, possibly an unarticulated need, for technological innovation. The later nineteenth
century, particularly in the United States, was to see the diffusion of devices for the
reproduction of written utterances in administrative contexts: the typewriter, which
offers greater speed and demands less labour at the point of production, and the stencil
duplicator, which reduces human intervention in copying (Day, 1996, pp. 682-4).
Techniques for transcribing oral speech, particularly systems of shorthand, were also 
increasingly adopted (Warner, 1993). Message transmission technologies, such as the 
telephone and telegraph, also proliferate (Ohlman, 1996; Warner, 2004, pp. 5-35). In 
comparison to printing from moveable type, the devices for the production and 
reproduction of written utterances in administrative contexts demand less capital 
investment, are more physically compact, and can be used to produce first copies 
faster. Earlier, Babbage (1835, p. 70) had enumerated the various methods of printing,
understood as an art of copying: printing from cavities, printing from surface,
casting, moulding, stamping, punching, with elongation, and with altered dimensions.











The innovatory dynamic is now that of capitalism, not just of increasing control over
the environment. Capitalism, and technological innovation, occurred in a particularly 
pure and intense form in the late nineteenth century United States (“the age of 
invention”), with only a diminished inheritance from feudalism (Rosenberg, 1994, 
pp. 109-20; Marx, 1976, p. 1,014n). Individually and collectively, the ICTs adopted both 
reduce the direct human labour required for similar processes and expand the scope of 
human activities. 

A later nineteenth century fiction, Mark Twain’s A Connecticut Yankee in King 
Arthur's Court, first published in 1889, contrasts sixth and nineteenth century 
technologies, with the nineteenth century fictional locus situated in the intense period 
of Yankee innovation, after the initial diffusion of late nineteenth century innovations 
in ICTs. Technologies for the reproduction of written utterances for public contexts are 
counterposed, with a nineteenth century newspaper implicitly compared to a sixth 
century manuscript: 


A thousand of these sheets have been made, all exactly like this, in every minute detail — they 
can’t be told apart. 


A thousand! Verily a mighty work — a year’s work for many men. 
No — merely a day’s work for a man and a boy (Twain, 1889, p. 343).


In accord with the labour theory of value, which informs the discursive components of 
the fiction, the primary explicit focus is on reduced labour in production. Secondarily, 
and with less explicit emphasis, the possibility of machinery enabling activities 
previously difficult, in this instance the exact reproduction of a large number of 
multiple copies, is recognised. At a later point in the narrative, a newspaper is received 
“damp from the press” and this could be understood as the product retaining traces of 
its process of manufacture. The errors in the sample pages given (Twain, 1889, pp. 342, 
495-6), for instance, of inversion, mixed fonts, and incorrect selections, are 
characteristic of type set by hand from individual characters. Mechanical methods 
for typesetting were beginning to be developed in the late nineteenth century United 
States and Twain was himself concurrently experiencing the difficulties and risks of 
transforming invention into innovation, in a failed investment in Paige’s typesetter 
(Kaplan, 1983, pp. 280-8). 

The Yankee period of invention occurs immediately prior to, and contemporary 
with, the convergence of science and technology and the transformation of invention 
into a largely corporate activity, in the late nineteenth century. It involved technologies 
less complex to manufacture than modern technologies. Factories produced multiple 
products: Remington, for instance, produced guns, sewing machines, and typewriters, 
with the design of typewriters influenced by the pre-existing sewing machines (Day, 
1996, p. 682). Knowledge of the various manufacturing processes could be mediated by 
a single human carrier, the Yankee transported from Hartford to Camelot. The sixth 
century actors in A Connecticut Yankee in King Arthur’s Court find it difficult to adapt 
to the technology imported:


I never saw such an awkward people, with machinery; you see, they were totally unused to it 
(Twain, 1889, p. 437).

The observation on the awkwardness arising from lack of familiarity with technology
could be regarded as an informal anticipation, though one yielding fully comparable
insight, of the need for learning and the difficulties associated with abrupt changes and
discontinuities discovered in modern studies of the implementation of information
systems (Brynjolfsson, 1993, p. 75; Brynjolfsson and Hitt, 1998, p. 54).


Modern technologies for the reproduction of written utterances are sufficiently 
familiar not to demand detailed description, but do continue trends discernible 
historically. In copying by clicking a mouse, work is transferred from direct human 
labour to machine processes, although, crucially, human intervention is still required. 
Copies are exact replicas, and data on file size and other attributes can be produced to
compare copies for identity, or, at least, absence of discoverable difference. Collation 
between documents, or versions of a document, can be performed, with human 
intervention in initiation and interpretation. Interpretation remains a direct human 
activity. The process is more fully congealed in the product than with other forms of 
production: a file need not reveal, for instance, whether it has been directly typed or 
automatically copied from a previous file. Technologies for administrative and public 
purposes have converged more fully, following their beginning rapprochement in the 
late nineteenth century. 
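
The comparison of copies by file size and other attributes, mentioned above, can be illustrated with a minimal sketch. It assumes Python and two hypothetical helper names, file_attributes and no_discoverable_difference, neither of which comes from the sources discussed; the choice of a SHA-256 digest as the "other attribute" is likewise an assumption made for illustration. Agreement of size and digest indicates only the absence of discoverable difference, and says nothing about whether a file was typed or copied.

```python
import hashlib
import os


def file_attributes(path):
    """Return (size in bytes, SHA-256 hex digest) for the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in blocks so that large files need not fit in memory.
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return os.path.getsize(path), digest.hexdigest()


def no_discoverable_difference(path_a, path_b):
    """True when size and digest agree for both files.

    Agreement shows absence of discoverable difference between the
    copies; it does not reveal how either file was produced.
    """
    return file_attributes(path_a) == file_attributes(path_b)
```

The machine produces the attributes, but a human agent still initiates the comparison and interprets its result, in keeping with the argument above.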

A pattern of renewal and change can be detected. Continuity in techniques, for 
instance, inscription on a substrate, can mask changes in function, for instance, from 
preservation over time to transmission over space and time. The development of 
information technologies has been considered strongly analogous to evolution 
(Levinson, 1997), but it should be recalled that one is a natural effect and the other 
produced by human work on originally natural products. A highly specific analogy has 
been made here between punctuated equilibrium in the evolution of natural forms and 
in the development of ICTs. A series of innovations can precede diffusion of a major
innovation, analogous to the accumulation of contradictions before a social revolution, 
and this is exemplified by the late nineteenth and early and mid-twentieth century 
innovations in data processing before the diffusion of the computer as a universal 
machine. The transitional forms subsequently disappear from common use, although 
they can be recovered by historical enquiry. The historical process of development, 
yielding greater human organization and command over nature, is accelerated by the 
innovatory dynamic of capitalism. 

The productivity paradox can be both dissolved and explained by a consideration of 
the patterns revealed. If productivity is understood in a strict intra-process sense as the 
amount of written utterance produced per unit of human input, using an understanding 
of information quantity analogous to measures of quantity in classical information 
theory (Shannon, 1948; How much information, 2003), then it is clear that productivity 
has drastically increased over time. Direct human labour has been progressively 
transferred to machine processes, embodying accumulated human labour, and the 
machinery is transformed from a dead to an active state by human intervention. If a 
further, but still closely related, sense of productivity is used, in which the costs of
different types of unit of human input are considered, then there has been a further
increase in productivity, as the status and relative costs of clerical labour have
diminished as its semantic components have been diminished. Productivity, in the
dominant sense used in information systems, of use- or exchange-values produced, is
itself diminished by the ease of reproduction of written utterances, by the increase in
productivity in its intra-process sense. Comparative advantage is diminished by
competitors having access to similar technologies. 

The productivity paradox, in the information systems sense, can then be understood 
as a productivity effect, corresponding to transition costs between social and
technological states. Changes between technological states involve costs which may 
detract from immediate productivity. Such costs would include distraction from central 
productive tasks during transition and adaptation to the possibilities of the new 
technology. The empirical data covered by the productivity paradox has, then, been 
acknowledged, but a deeper level of explanation given. 


An information dynamic 

A dynamic can be constructed which accounts for the patterns observed, further 
explains productivity effects, and which links micro-level processes to macro-level 
results. 

The dynamic of conceptualisation, invention, and innovation for information 
technologies can first be considered. A human agent, which could be an individual, 
although still a social individual, or a group of individuals, informally or formally 
linked together, conceives of a technology. Human agents, which could, but need not,
include those responsible for conceptualisation, then adopt the conceptualisation, as an 
object they desire to create, and work to transform the conceptualisation into an 
invention. In modern practice, this process is likely to involve collectivities working 
together rather than directly individual work. For information technologies in their 
material aspect, the process involves encounter with obdurate material reality, and, in 
their semiotic aspect, the transformation of logical or algorithmic abstractions into 
operational procedures, encountering semiotic complexity. Many conceptualisations, 
even if relatively formalized, may not be transformed into inventions. 

For conceptualisation and invention, human labour has to be diverted from socially 
necessary productive tasks. Transforming invention into innovation, turning 
demonstrated technical feasibility into a robust product which can work 
independently of its makers, involves further distraction from established forms of 
productive labour and increased encounter with material and semiotic complexity. The 
process of innovation is likely to involve greater numbers of people and to be more 
publicly visible. Marx was acutely aware of the risks of invention and innovation: 


The much greater costs that are always involved in an enterprise based on new inventions, 
compared with later establishments that rise up on its ruins, ex suis ossibus. The extent of this 
is so great that the pioneering entrepreneurs generally go bankrupt, and it is only their 
successors who flourish... Thus it is generally only the most worthless and wretched kind of
money-capitalists that draw the greatest profit from all new developments of the universal 
labour of the human spirit and their social application by combined labour (Marx, 1981, p. 99). 


An agent which develops an innovation encounters risks but also obtains a 
comparative advantage over its competitors, characteristically by reducing costs by 
transferring production from direct human labour to machine processes. Many
inventions will not be transformed into innovations. 

As the robustness and utility of the innovation is demonstrated in use, other agents 
may adopt the innovation, ensuring its diffusion. Adoption may be by deliberately 
parallel development or purchase (corresponding to the transition from developing 
ICTs to purchasing packages). The comparative advantage attaching to the use of the
innovation is then diminished:


As machinery comes into general use in a particular branch of production, the social value of
the machine’s product sinks down to its individual value, and the following law asserts itself:
surplus-value does not arise from the labour-power that has been replaced by the machinery,
but from the labour-power actually employed in working with the machinery (Marx, 1976,
p. 530).


Agents adopting innovations incur transition costs, which include the development or 
purchase costs of the innovation, diversion from central productive tasks, and adapting 
to the new technology and methods of working. As the innovation is increasingly 
adopted, it diffuses further, enhancing social productivity and providing the grounds 
for further conceptualisation and invention. 

The dynamic constructed can now be summarised, revealing its simplicity and 
power. An agent or agents develop a conceptualisation, transform it into an invention, 
and further into an innovation, incurring production costs. An agent then brings the 
innovation into use and encounters transition costs. Other agents adopt the innovation,
thereby diffusing it, but also incur transition costs, although less severe than those for
initial users. Diffusion effectively compels adoption of innovations by other agents, in
order to maintain competitiveness. 

Micro-level processes and macro-level results have, then, been linked, with
adoptions of an innovation producing its diffusion. The productivity paradox can be 
understood as a productivity effect, corresponding to production and transition costs 
associated with the introduction of new technologies, and need not be allowed to 
obscure gains in social productivity. The enhanced level of productivity resulting from 
the diffusion of innovations in turn provides the ground for further innovation. 

Increases in productivity, in an intra-process and historical sense, enabled by 
innovation in and diffusion of information technologies can be exemplified by 
returning to technologies for the reproduction of written utterances. If the nineteenth 
century procedures of copying by hand and collation by oral reading were transported 
forward they would be hopelessly uneconomic. In real historical terms, the scrivener, in the
sense understood by Melville, may have disappeared as a profession, and the term itself
has become obscure. Conversely, if modern technologies were transported to a 
nineteenth century context, their comparative advantage would derive from the direct 
human labour they replaced. In a modern context, where other agents have access to 
comparable technologies, they do not confer a comparative advantage and value arises 
from human labour and materials directly employed. 

A dynamic has, then, been constructed which accounts for existing empirical data, 
by adapting powerful existing understandings to considering semiotic labour and to 
ICTs, and which explains patterns in that data, more fully than newly created 
observations and theories. 


Conclusion

The dynamic discovered for information technologies over human history, and, 
specifically, under capitalism, is similar to that already understood for industrial 
technologies, although for mental rather than physical labour. Direct human labour in 
productive processes is reduced, although not excluded, and replaced by machine
processes. Gains in productivity in reduced time and costs expended in the
construction of functionally similar products are obtained. 

The dynamic has implications at various levels of analysis, for information science 
and information systems as scholarly activities, for understandings of theory, for 
contrasts between information and capitalist societies, for the place of work in being 
human, and, finally, for the possibility of intervention in the dynamic. 

Information science has been characterized as a partly unreflexive response to 
post-1945 developments in telecommunications and computing technology (Warner, 
2004, p. 5). Information systems as a set of scholarly activities could be regarded as 
emerging from the need to develop and to situate computer based information systems 
in organisational contexts. In this respect, it was an epiphenomenon of the dynamic 
which produced the motivating technology. As the costs of purchasing systems have 
fallen below the costs of developing systems separately, and as systems have been 
assimilated to human consciousness, information systems’ role and utility have
diminished. 

One response to the perceived crisis in information systems has been a resort to 
theory, conceived in relatively weak senses which can reproduce the divides, for 
instance, between technology and society, which motivated its production (Ciborra, 
2004). From the dynamic revealed here, the possibility of a convergence between three
very powerful, established, and well-developed theories is intimated: the theory of
computability, classical information theory concerned with the limits and possibilities 
for signal transmission, and the Marxian perspective on human history. 

A fundamental continuity from capitalist to information societies, rather than a 
disjunction, is implied by the discovery of a common underlying dynamic. 
Sociotechnical action involving ICTs can be specifically understood in relation to 
established, and sophisticated, theories of innovation and its risks. For instance, 
difficulties associated with the introduction of systems were conceived as transition 
costs between states. 

For Freud, “love and work are the cornerstones of our humanness”. Yet, liberation 
from toil has been a central component of utopian visions (Mattelart, 1996, p. 282). If 
first physical labour and then aspects of mental labour are progressively transferred to 
technologies, how can the values derived from work, particularly of association and 
communal solidarity, be replaced? Release from forms of burdensome labour, both 
physical and clerical, could still be liberating. 

Finally, understanding a dynamic also gives the possibility of meaningfully 
intervening in the progress of that dynamic. The overall dynamic may not be easily
amenable to individual or institutional modification (there is no powerful lever with a
fulcrum outside the dynamic), but it can be recognised and influenced, and
adjustment to it deliberately made. We might conclude by recognising the utility of a
revealing theory.


Abstract

Purpose — To examine the causes of knowledge management (KM) failure.
Design/methodology/approach — A multi-case analysis approach was used to review five 
documented cases of KM failure in the literature. Categories of risk were identified through an iterative 
analysis of each case. 


Findings — There are four main categories of risk associated with KM failure, namely technology 
risk, culture risk, content risk and project management risk. The nature of these risks differs 
dependent upon the stage of a KM project. 

Research limitations/implications — A limited number of cases were reviewed. 

Practical implications — Practitioners need to proactively manage risk to avoid failure in KM 
projects. 

Originality/value — Proposes a taxonomy of KM risk. 

Keywords Knowledge management, Management failures, Risk management, Project evaluation 


Paper type General review 


Introduction 

Corporate spending on knowledge management (KM) projects has increased 
substantially over the years (Ithia, 2003). Fuelled by the notion that knowledge is a 
key resource upon which an organisation’s competitiveness depends (Kogut and 
Zander, 1992), organisations are implementing various KM initiatives to identify, share 
and exploit their knowledge assets. Most KM projects take the form of developing
discussion databases, technical libraries and lessons learned databases, starting
communities of practice, and transferring best practices. Several highly publicised
KM success stories include Buckman Laboratories’ Knowledge Network (Zack, 1999),
Xerox’s Eureka database (Brown and Duguid, 2000), Tech Clubs in DaimlerChrysler,
the communities of practice among quantitative biologists in Eli Lilly (Wenger et al.,
2002), various KM initiatives in BP Amoco (Hansen, 2001) and the Center for Army
Lessons Learned (Thomas et al., 2001).

The purported benefits of KM (improved decision-making, increased productivity,
fostered innovation, minimised reinvention and duplication, accelerated staff development,
and a lessened impact of staff attrition) would entice many a CEO. In some cases, the
reported benefits from KM have been nothing short of remarkable. Xerox, for example,
is estimated to have saved $100 million from its Eureka database.

However, despite the KM furore, it has been estimated that 84 per cent of KM 
projects exerted no significant impact on the adopting organisations (Lucier and 
Torsiliera, 1997). Rather worryingly, this suggests that most KM projects actually fail 
rather than succeed. Following Davenport et al. (1998), KM failure can be defined as a 
project which has few or none of the following characteristics: 

* growth in the resources attached to the project, including people and budget; 

* growth in the volume of knowledge content and usage; 

* a high likelihood that the project would survive without the support of a 
particular individual or two; and 

* evidence of financial return either for the KM activity itself or for the larger 
organization. 


There are obvious reasons why organisations would want to conceal KM failure, at 
least from public view. However, understanding the reasons for KM failure, and how 
KM projects can be better managed to avoid such failure, is an issue of high importance 
to the many organisations engaged in some type of KM activity. 


Aim of this paper

This paper seeks to examine the causes of KM failure through a multi-case analysis. 
On the basis of the findings, a taxonomy of KM risks is proposed. The taxonomy is 
intended to offer insights to practitioners and help them manage the risks associated 
with KM projects. Additionally, given the scarcity of research publications in KM 
failure, a secondary aim of this paper is to seed interest among the academic 
community in this area, and trail-blaze the building of a body of formal knowledge on 
KM risks. 


Research method 

In order to shed light on KM failure, several case studies of KM failure from the 
literature were analysed. An initial, prospective set of cases was drawn up by 
searching three popular online databases (ProQuest, Ebsco Host and Emerald) using 
the search terms “knowledge management”, “failure” and “abandonment”. Search 
results that were obviously inappropriate were discarded. Cases were then filtered 
according to two criteria: 


(1) the case was published in a peer reviewed scholarly journal, ensuring a certain 
level of case quality; and 


(2) the case provided sufficient contextual details about the KM project from 
inception to eventual abandonment. 


From the set of five documented cases that resulted from the filtering, the authors 
examined the circumstantial elements of the failure, including the rationale and 
intended objectives of the KM project, the outcomes of the project and the reasons that 
led to eventual project abandonment. 


Cases in KM failure 

Case 1: a global bank 

A global bank which spanned 70 countries decided to implement various KM projects 
after the departure of a major client who felt there was a lack of integrated services 
from the bank across divisions and countries (Newell, 2001; Scarbrough, 2003). The 
main objective of the KM project was to leverage intranet technology to develop a
global knowledge network so that the services in the bank could be integrated. Among
the several independent intranet projects that proliferated were OfficeWeb, GTSnet and Iweb.

OfficeWeb was abandoned even before it was rolled out. GTSnet held obsolete
content soon after it was implemented. Iweb was more successful than the other two 
projects, but it failed to promote any sharing of knowledge within the IT division. 


Case 2: a pharmaceutical company 

An American owned global pharmaceutical company which specialised in high margin 
“lifestyle” drugs aimed to accelerate its internal drug development processes through 
overt KM initiatives. The management committed a substantial amount of political 
and financial resources to implement three KM projects: “Lessons”, “Warehouse” and 
“Electronic Café” utilising groupware technologies such as common repositories and 
discussion forums (McKinlay, 2002). 

“Lessons” yielded uneven results within three years of its implementation. 
“Warehouse” could not be adapted to the specific context of each workgroup, while 
“Café” was perceived to be exclusive, impractical and remote from reality. None of 
these KM projects had an effective mechanism to encourage participation or measure 
outcomes. 


Case 3: a manufacturing company 

A European manufacturing company that had more than 60 production units in some 
30 countries implemented three distinct KM projects, namely the “Production Project”, 
“Supply Chain Project” and “Design Project” (Kalling, 2003). The focus of “Production” 
was on capturing, documenting and sharing knowledge about production methods 
such as machine maintenance methods and safety prevention with the main aim of 
cutting production costs. “Supply” attempted to codify knowledge culled from 
customers, warehouse delivery centres, transporters and end consumers with the aim 
of enhancing product functionality and a better understanding of the effects of product 
design on the economics of transport and warehousing. The objective of “Design” was 
to improve structural product design so that designers could construct prototypes with 
minimal raw materials. 

Two years after implementation, “Production” was able to capture and transfer 
knowledge to the plant that needed it, but its aim to promote the application of the new 
knowledge resulted in a mixed level of success. Both “Supply” and “Design” were 
under-utilised and became obsolete after a while. 


Case 4: a European headquartered company 

The management of a European headquartered company commissioned a high profile 
KM team comprised of nine management staff to implement an organisation wide KM 
initiative (Storey and Barnett, 2000). The initiative encompassed a series of plans such 
as creating informative web pages of the management and all business units, 
organising staff into communities of practice and identifying internal knowledge 
champions.

As time passed, however, the team found out that the web site and intranet 
development were divided between the IT and media affairs departments. These two 
departments had diverging agendas and held conflicting views as to how the IT 
systems should be developed. The team also suspected that the IT manager’s involvement
in the KM initiative was purely political, intended to gain a dominant position in the
company’s strategy and budget planning. Meanwhile, external market conditions
deteriorated and prompted the company to implement a major organisational
restructuring exercise. The KM initiative faded and became lost in the turbulence. 


Case 5: a global company 

A global company lost a number of deals because of its inability to offer integrated 
solutions in the order handling line of business (Braganza and Mollenkramer, 2002). In 
response, the management commissioned a KM project known as Alpha. 
Underpinning Alpha was a comprehensive attempt to manage the knowledge across 
the company via a network of “Knowledge Enabled Worktables” to provide staff 
customised access to Alpha’s knowledge base. 

Owing to the teething problems of using new technology and the poor translation of
design requirements to system functionalities, the IT team could not complete the first 
Worktable for the sales function on schedule. By the end of the year, the viability of the 
Worktable was in doubt. Given the high dependence and unsustainable expenditure on 
external IT resources, Alpha was perceived to be losing control over its IT related 
projects. Thus, the management curtailed the Worktable project and disbanded Alpha 
completely when it eventually lost faith in KM. 


Findings: a taxonomy of knowledge management risk 

The researchers identified and isolated risks that contributed to KM failure in each 
case. Through an iterative process of categorization, the risks were consolidated and 
organized into several categories on the basis of similarity. The result of our analysis is 
the KM risk taxonomy shown in Table I. The taxonomy looks at risk from two 
perspectives. 


(1) First, the phase in the project lifecycle at which the risk is most likely to occur, 
i.e. the planning, implementation, rollout or institutionalisation stage (Table II).
KM projects are long-term organisational initiatives rather than a 
“quick-and-dirty” activity, and project risks can emerge at any point during 
the lifecycle. Not all KM projects necessarily reach the institutionalisation stage, 
and it is even conceivable that particularly poor projects may be abandoned as 
early as the planning stage. 

(2) Second, whether the risk relates to technology, culture, content (or knowledge),
project management or stakeholder. KM projects are not simply technology
projects. There are cultural issues to consider related to the values of the
organisation, content issues related to the characteristics of the knowledge
itself, project management issues and stakeholder issues.


Table I. KM risk taxonomy

                          Planning                  Implementation              Rollout                           Institutionalisation
Technology risk           Technology ignorance      Technical over-complexity   Poor KM tool usability and        Lack of scalability
                                                                                reliability
Culture risk              Techno-bias               Organisational mismatch     Knowledge hoarding and lack of    Poor perceptions of
                                                                                knowledge sharing                 knowledge reuse
Content risk              Imprecise KM problem      Poor content structuring    Lack of content currency,         Lack of knowledge distillation
                          definition                                            relevance and contextualisation
Project management risk   Shortfall in expertise    Lack of user involvement    Haphazard rollout                 Lack of KM measurement and
                                                    and stakeholder conflict                                      evaluation


Table II. KM project lifecycle

Stage in KM project lifecycle

Planning              The need for KM is identified, and a dedicated team
                      is assembled to address the organisation’s specific
                      KM needs

Implementation        Users’ requirements are gathered and the necessary
                      systems and processes are developed

Rollout               The KM project is formally rolled out to a part of or
                      the entire organisation; this is the first time KM users
                      interact with the “live” version of the systems and
                      are engaged in the processes brought about by the
                      KM project

Institutionalisation  The visibility of the KM team fades. The new KM
                      systems and processes become part of the day-to-day
                      routine of the organisation


The following sections of the paper describe the risks in the KM risk taxonomy in 
greater detail. 


Technology risk 

Most KM projects involve the application of technology such as the creation of 
discussion forums, knowledge repositories and intranets. The specific technology risks 
identified in our study were as follows. 


Technology ignorance 

There is no clear vision of how technology will be used to support KM. Either the
technical vision is not clearly defined, or it is misaligned with the overall goals of the project.
In case 2, the technical architecture of “Café” was relatively simple, but lacked 
appropriate collaboration functionality to support the development of communities of 
practice within the organisation. In case 4, the IT vision switched from being based 
around web sites to a document management system. 


Technical over-complexity 

A technical solution is designed that is more complex than is really required.
As a consequence of technical over-complexity, the project team needlessly expends
more time and effort, resulting in project slippage and additional project costs. In case 
3, “Design” was a highly sophisticated software system but it was largely neglected by 
its intended users. In case 5, the project was abandoned because the IT maintenance 
costs associated with Alpha became unsustainable. 


Poor KM tool usability and reliability 

The KM tool suffers from poor usability and the intended users of the tool, who may 
not be IT savvy, face a steep learning curve. Usage of the tool might also suffer from 
poor reliability due to software failure and bugs. In case 3, “Design” was perceived to be
too cumbersome and difficult to understand, which impeded its widespread usage
among designers.


Lack of scalability 

The technical infrastructure is unable to support the required volume of users due to 
bandwidth and other technical limitations. High loads on the system affect 
performance and system responsiveness. Expansion of the knowledge base might 
also be limited. The failure in the OfficeWeb project in case 1 was attributed to a lack of 
bandwidth to support increased network traffic (although this problem was identified 
early on in the project). 


Culture risk 

Culture relates to the values and beliefs of the individuals within the organisation, as
well as the organisation as a whole. Unless an organisation has a “knowledge sharing” 
culture, a KM project is unlikely to succeed. The specific culture risks identified in our 
study were as follows. 


Techno-bias 

A technology-centric view of KM is taken where cultural, organisational and other 
softer aspects are ignored. For a long time, technology was perceived to be the panacea 
for all KM problems because it represents a highly tangible and visible solution (Silver, 
2000). Several scholars and practitioners have cautioned against excessive focus on 
technology (Davenport and Prusak, 1999; Nonaka and Takeuchi, 1995), arguing that 
technology is merely an enabler that supports KM efforts. In case 5 there was an 
over-reliance on IT systems to manage knowledge in Alpha, while tacit knowledge and 
behavioural issues received scant attention. 


Organisational mismatch 

The KM project is not grounded in the organisation's strategy or well aligned to 
existing organisational structures and roles. As in case 4, when management face a 
crisis, KM is viewed as optional and “nice-to-have” rather than an integral part of how 
the organisation works. Employees are not formally assigned responsibility for KM 
activities, nor is their performance judged on that basis. KM inevitably 
ends up as a low priority activity. 


Knowledge hoarding and lack of knowledge sharing 

Employees do not share knowledge within the organisation, or even worse, exhibit a 
strong tendency to hoard knowledge. Employees see little motivation for knowledge 
sharing. In case 1, GTSnet failed to convince the users of the importance of the project 
to the success of the division, so was unable to change the users’ basic attitudes 
towards knowledge sharing behaviour. 


Poor perceptions of knowledge reuse 

Employees have poor perceptions of knowledge reuse. Knowledge reuse is frowned 
upon as a reflection of an individual’s own lack of creativity and innovation. In case 2, 
accessing “Warehouse” was perceived as a sign of inadequacy while contributing to 
“Warehouse” was perceived as a loss in personal expertise that raised concerns over 
job security. 


Content risk 

KM projects revolve around the creation, capture and dissemination of knowledge 
content. The nature of the knowledge content, as well as its shape and form, will 
depend upon the specific project. The specific content risks identified in our study were 
as follows. 


Imprecise KM problem definition 

There is no clear definition of the KM problem being solved. In the absence of real KM 
users, the project team “fantasises” KM requirements based on their own opinions. 
Even when real KM users are available, there is no clear conception of the KM problem 
or well-articulated set of KM requirements. Consequently, there is difficulty 
understanding what type of content is needed. In case 3, when “Supply” was 
launched, it was under-utilised because users found that the software merely provided 
them with information they already possessed. Moreover, “Supply” neither resulted in 
increased sales volume for sales staff nor helped the designers create better products. 


Poor content structuring 

The content is not structured in a format that is meaningful to the task at hand or is 
digestible to KM users. In case 5, critical knowledge that straddled across multiple 
functional groups was neglected and the content was developed fragmentarily from 
different groups of KM users. In case 2, the open-ended nature of “Café” made it 
difficult to locate important knowledge from a sea of discussion content. 


Lack of content currency, relevance and contextualisation 

The content is out-of-date, irrelevant or insufficient to the task at hand. In case 2, 
“Warehouse” could not be adapted to the specific context of each workgroup, and was 
thus deemed to be irrelevant to day-to-day operational processes. In case 3, designers 
neglected “Design” due to its ineffectiveness in helping to reduce the raw material costs 
and made little effort to update its knowledge base. 


Lack of knowledge distillation 

There is no effective mechanism to distil knowledge from debriefs and discussions. 
Important lessons learnt are not extracted or crystallised, so valuable knowledge 
remains obscured. In “Lessons” (case 2), there was no mechanism to sift through the 
lessons compiled, nor were there any opportunities to extend the scope of the 
exercise beyond existing procedures. In addition, the output from “Lessons” was a list 
of dissatisfaction with how standard operating procedures were applied rather than 
critical reflections on the procedures themselves. 


Project management risk 
KM projects, like any other project, can suffer from project management risks. The 
specific project management risks identified in our study were as follows. 


Shortfall in expertise 

The project lacks the required expertise to sustain the project, e.g. technical, business, 
organisational change or project management skills. In case 1, GTSnet was staffed by 
external IT consultants who did not possess the relevant business knowledge, and was 
unable to garner support internally to bring together the required technical and 
business expertise when launched. However, bringing external consultants onboard 
the project is not always helpful. In case 5, the engagement of three external consulting 
firms caused the KM project to meander and created confusion. 


Lack of user involvement and stakeholder conflict 

There is a lack of user involvement. Conflicts between project stakeholders 
are left unresolved. In case 1, GTSnet did not involve the targeted end-users during the 
implementation stage. More importantly, it failed to convince users of the importance 
of the project to the success of the division. In case 4, the KM team failed to manage the 
political processes between the IT and media affairs departments, which undermined 
the project from the start. 


Haphazard rollout 

The KM project does not have a proper rollout strategy, or is hastily rolled out in an 
unready state. In case 4, the KM team spent little time deliberating on the potential 
barriers to the KM project. As such, teething problems plagued the rollout, which could 
have been avoided had a pilot phase been considered. 


Lack of KM measurement and evaluation 

There is no systematic effort to track and measure the success of the KM project. 
Opportunities to publicise KM success stories are not seized. Conversely, there are no 
opportunities to correct mistakes. In “Production” (case 3), out of 40 plants studied, ten 
plants did not apply the new knowledge largely because they did not perceive a 
production performance gap in their plants. They were unconvinced of the value 
created from applying the new knowledge although it was later discovered that the rest 
of the plants which applied the new knowledge actually saw a significant improvement 
in their production performance. 


Conclusion 

Enticed by the plethora of success stories, many organisations embark on KM projects 
believing their well-intended efforts will naturally result in the better exploitation of 
knowledge assets for business benefit. While published cases of KM project failure are 
still rare, this paper has highlighted the multitude of potential risks with KM projects. 
Unless KM practitioners are aware of such risks and able to manage them effectively, KM 
project failure may become just as common as IT project failure. 


Implications for practitioners 
This paper offers four implications for KM practitioners. The first is related to 
technology. A technical individual should be appointed to the KM project team who is 
able to formulate a clear vision of how the technology will be used. A justification for 
the technology should be given to ensure that it is aligned to the goals of the KM 
project at large. As part of this, the total cost of ownership (TCO) should be calculated 
so that excessive technology spending can be curbed. Prototypes and preliminary 
testing should be conducted early on in the project to identify potential usability, 
reliability and scalability problems. Such tests should be conducted before 
commitment to any particular choice of tool or vendor product is given. 

The second is related to culture. The KM team should comprise a mix of individuals 
from both the business and technology side to ensure a well-balanced perspective is 
maintained. The personnel department should also be involved in the project as KM 
impacts on the personal development of employees. Management should acknowledge 
KM as a formal activity and, where appropriate, include it in individual work plans and 
performance objectives. The KM strategy and approach should be documented and 
presented to senior management to ensure buy-in and alignment to organisational 
goals. Champions should be identified who can demonstrate to staff how KM can 
improve individual and team performance. In some cases, management may wish to 
consider rewards (financial or otherwise) for contribution in KM activities. 

The third is related to content. A KM requirements document should be 
produced that describes the KM problem being addressed, the nature of the KM 
content needed to solve the problem and any required tools. The KM requirements 
document should be a key deliverable in the KM project plan and should be 
signed-off by the user representative before the project is allowed to move to the 
implementation stage. Prototypes should be conducted with users to ensure that 
the content is properly structured and relevant to problem solving. Organisational 
processes should be defined to address how knowledge within the organisation can 
be acquired and captured in the KM repositories in a timely manner. 

The final implication is related to project management. Key project positions and 
skills requirements should be identified, and staff recruited to those positions before 
the project formally begins. External expertise should be sourced if in-house expertise 
is unavailable. A small set of users from the user community should be identified and 
formally assigned to the project as user representatives. A project rollout plan should 
be created during the planning and implementation phases. If the knowledge base is 
shallow, or if there are concerns over the technology, then an incremental rollout plan 
should be adopted. A specific set of measurable success criteria should be drawn up 
before rollout. Such criteria might relate to the growth of the knowledge base or the 
level of usage of the KM system. Specific review points should be agreed where the 
management team are able to review the success of the project and, if needed, take 
corrective action. 


Implications for researchers 

With so few published KM failure cases to date, it is no surprise 
that taxonomies of KM risks have hardly been developed. This paper represents a 
first step in this area and invites more inquiry into KM failures. A number of 
future research directions can be conceived. The first is to improve the 
methodological rigour of the paper by selecting more case studies to validate 
against the model proposed. Other significant KM risks not already identified in 
this paper could be uncovered. If the cases are numerous and sufficiently diverse, 
a more robust taxonomy of KM risks that highlights the relative importance of 
each KM risk across dimensions such as industrial sector, size and organisational 
culture can be constructed. 

A second suggestion is to carry out primary data collection to statistically 
determine the relative contribution of each KM risk to the project failure. The 
model, which in its current form specifies KM risk on a simple matrix format, can 
be refined to better illustrate the web of causal relationships among the KM risks. 

A final suggestion for research is to develop a set of metrics to measure the extent to 
which success/failure is experienced as a KM project progresses through the stages in 
the project lifecycle. The metrics will provide a more precise calibration of the health of 
the KM projects, and allow any slippages along the way to be detected early. Hopefully, 
by seeding the interest in the study of KM risks, more inquiry from the research 
community will be elicited. Over time, a body of formal knowledge on KM risks can be 
built. 


Abstract 


Purpose — The generation of inverted indexes is one of the most computationally intensive activities 
for information retrieval systems: indexing large multi-gigabyte text databases can take many hours 
or even days to complete. We examine the generation of partitioned inverted files in order to speed up 
the process of indexing. Two types of index partitions are investigated: TermId and DocId. 


Design/methodology/approach — We use standard measures used in parallel computing such as 
speedup and efficiency to examine the computing results and also the space costs of our trial indexing 
experiments. 

Findings — The results from runs on both partitioning methods are compared and contrasted, 
concluding that DocId is the more efficient method. 

Practical implications — The practical implications are that the DocId partitioning method would 
in most circumstances be used for distributing inverted file data in a parallel computer, particularly if 
indexing speed is the primary consideration. 

Originality/value — The paper is of value to database administrators who manage large-scale text 
collections, and who need to use parallel computing to implement their text retrieval services. 
Keywords Information retrieval, Databases, Parallel programming 


Paper type Research paper 


1. Introduction 

The generation of inverted indexes for text databases is a computationally intensive 
process that requires the exclusive use of processing resources for long periods. The 
following considers techniques that could be used in order to speed up the generation 
of the initial inverted file. The research described in this paper is part of an overall 
effort to understand and quantify the effects that differing partitioning methods for 
inverted files in parallel information retrieval (IR) systems have on the performance of 
indexing, search, passage retrieval and index update (MacFarlane, 2000). Two types of 
partitioning methods are investigated: term identifier (TermId) partitioning and 
document identifier (DocId) partitioning. A partition is defined as the logical 
distribution of the inverted file. TermId partitioning is a type of partitioning which 
distributes each word to a single partition, while DocId partitioning distributes each 
document to a single partition. These partitions are fragmented across physical disks. 
A fuller discussion of these partitioning methods can be found in Jeong and Omiecinski 
(1995) and MacFarlane et al. (1997) and an example can be found in Appendix 1. Two 
types of index build methods are used: local and distributed. With local build, 
documents are kept on a local disk and analysis is done on that local disk only 
(Hawking, 1997): this method is applicable to DocId partitioning only. The distributed 
build method works by distributing the documents to nodes from a single disk. Section 
2 describes a re-configurable process topology used to create different types of 
partitioned inverted files. Sections 3 and 4 describe the individual components of this 
process topology, while the indexing methodology used for the experiments is outlined 
in Section 5. The hardware used for the experiments is described in Section 6, while the 
data used in the experiments is described in Section 7. In Sections 8 and 9 we describe 
some results on build methods using DocId partitioning and TermId partitioning, 
respectively, concluding in Section 10 by comparing and contrasting the results. We 
provide a glossary of terms at the end. 
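
The difference between the two partitioning methods can be illustrated with a minimal sketch. It is not the PLIERS implementation: the function names, the round-robin document placement and the toy term-assignment rule are all assumptions made for illustration.

```python
from collections import defaultdict

def build_postings(docs):
    """Map each term to the sorted list of document ids that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

def docid_partition(docs, n_parts):
    """DocId: each document goes to one partition, which holds its own index."""
    parts = [dict() for _ in range(n_parts)]
    for doc_id, text in docs.items():
        # Round-robin placement is an assumption, not the paper's strategy.
        parts[doc_id % n_parts][doc_id] = text
    return [build_postings(part) for part in parts]

def termid_partition(docs, n_parts, assign):
    """TermId: each term, with its whole posting list, goes to one partition."""
    parts = [dict() for _ in range(n_parts)]
    for term, ids in build_postings(docs).items():
        parts[assign(term) % n_parts][term] = ids
    return parts

docs = {0: "parallel text retrieval", 1: "inverted file", 2: "parallel inverted file"}
by_doc = docid_partition(docs, 2)
by_term = termid_partition(docs, 2, assign=len)  # toy rule: assign by word length
```

Under DocId a term such as "inverted" can appear in several partitions, each listing only local documents; under TermId it appears in exactly one partition with its complete posting list.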


2. Indexing topologies 
Our requirement for indexing topologies is to be able to support both partitioning 
methods under consideration as well as the two build strategies. The components of 
the topology must be reconfigurable in order to create different build types and 
numbers of inverted file partitions using different process combinations. Figure 1 
shows examples of both types of builds using the DocId partitioning method, together 
with process to processor mapping examples. 

Figure 1. Build examples using the DocId partitioning method 

Figure 2. Distributed build example using the TermId partitioning method 

The local build method for parallel indexing is a very simple topology requiring 
little communication (Figure 1(a)). Each indexer node runs independently with no need 
for communication between them (the function of the indexer is described below). This 
form of build is applicable to DocId only. The distributed build method uses the 
process farm paradigm (Bowler et al., 1989) and an example of the one proposed for 
indexing is shown in Figure 1(b). The structure in the example consists of a farmer and 
n worker processes whose function is described below. Figure 1 shows the contrast in 
the build methods particularly with regard to the distribution of text to be indexed. The 
difference between the two methods is that text is kept locally when the local build 
method is used, and kept centrally on a single disk when distributed build is used (see 
Appendix 2 for an example of how this works). We use local build where a given 
collection could not be physically placed on a single disk (e.g. VLC2/WT100g). 

Each method has its own advantages and disadvantages, and we leave the detailed 
discussion of such until later. The issue of communication is important here. It can be 
seen from both the diagrams and the descriptions above that some topologies will 
require a great deal more network resource than others. For example, distributed build 
methods will require more communication than local build indexing in order to 
distribute text. Figure 2 shows an index topology example for TermId partitioning. 


3. Distributed build topology components 

In this section we describe the functions of the farmer, worker and global merge 
parallel processes. Note that there is only one farmer process, and a number of 
worker processes (which become global merge processes in TermId). Our reason for 
using this method is that it allows us to automatically distribute text to nodes: it has 
the disadvantage that the method is more communication intensive than the local 
build method (Section 4). 


3.1 Farmer process 

The farmer’s job is to distribute documents to the workers (Figure 3). Essentially it 
distributes work as equally as possible to create the least amount of load imbalance (LI) 
possible. Single documents or files containing multiple documents can be distributed: 
the latter saves communication time. There is an initialisation stage where each worker 
is given its first document/file; after that workers are only given documents/files 
when they request them, i.e. send a message to the farmer asking for more work. When 
no more documents/files are left, a termination notice is sent to every worker process. 
Document identifiers are allocated individually if the granularity of parallelism is 
documents and in blocks if it is files. The document length cannot be recorded until the 
document has been analysed, and these data are sent to the farmer when a worker 
requests further work: these data are saved to disk when received. In an attempt to 
keep workers load balanced, a request for work is serviced as soon as possible after it 
has been received so that workers who index small documents or files are not kept 
waiting for too long. 

Figure 2 panel titles: (a) Distribution of text phase; (b) Distribution of data to partition phase 


Distribute initial set of documents/files to all workers 
Loop no of files/documents 
    get a request from worker i 
    Case(request type) 
        work request: send document/file to worker i 
        id request  : send block of document ids to worker i 
    EndCase 
EndLoop 
Loop until all workers have been terminated 
    get a request from any worker i 
    Case(request type) 
        work request: send termination notice to worker i 
        id request  : send block of document ids to worker i 
    EndCase 
EndLoop 
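
The request-driven distribution described above can be sketched with threads and queues standing in for processes and messages. This is an illustrative sketch only: `run_farm`, the `TERMINATE` sentinel and the queue layout are assumptions, and the document identifier requests of the farmer algorithm are omitted.

```python
import queue
import threading

TERMINATE = None  # sentinel standing in for the termination notice

def farmer(files, n_workers, requests, replies):
    """Give each worker a first file, then serve work requests until none remain."""
    pending = list(files)
    active = n_workers
    for w in range(n_workers):  # initialisation stage
        if pending:
            replies[w].put(pending.pop(0))
        else:
            replies[w].put(TERMINATE)
            active -= 1
    while active > 0:
        w = requests.get()  # a worker asks for more work
        if pending:
            replies[w].put(pending.pop(0))
        else:
            replies[w].put(TERMINATE)
            active -= 1

def worker(w, requests, replies, results):
    """Process files until the termination notice arrives."""
    while True:
        item = replies[w].get()
        if item is TERMINATE:
            break
        results[w].append(item)  # stand-in for analysing the file
        requests.put(w)  # request further work from the farmer

def run_farm(files, n_workers):
    requests = queue.Queue()
    replies = [queue.Queue() for _ in range(n_workers)]
    results = [[] for _ in range(n_workers)]
    threads = [
        threading.Thread(target=worker, args=(w, requests, replies, results))
        for w in range(n_workers)
    ]
    for t in threads:
        t.start()
    farmer(files, n_workers, requests, replies)
    for t in threads:
        t.join()
    return results

files = [f"file{i}" for i in range(10)]
results = run_farm(files, 3)
```

Because work is handed out only on request, a worker that finishes a small file quickly simply asks again, which is how the scheme keeps load imbalance low without knowing file sizes in advance.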


3.2 Worker process 

The worker’s function is to break down the document into its constituent parts, i.e. 
terms, and perform some analysis on these terms, e.g. stemming using the 
methodology described below (Figure 4). If required, the position record is stored for 
each term using current values of accumulated data for field number, paragraph 
number, sentence number, word number and preceding stop words. After each word is 
found these values are updated. The worker creates and inserts this word/position data 
in a bucket: the method of storage for bucket elements is an AVL tree. In the case of 
DocId partitioning one bucket is used, while 100 are currently used for TermId: words 
are hashed to a given bucket based on a dictionary (Cowie, 1989). The posting list is 
either created using the document identifier and the position record or updated by 
incrementing the number of positions and adding the position record to the position 
list. When any of the memory limits is reached, the results are saved on a temporary 
file on disk for each bucket. A worker then requests work from the farmer and waits for 
a new document/file to analyse. A termination notice is received when there are no 
more documents/files to be processed and the worker either saves the inverted file 
directly from memory if the inversion has fitted into memory, or merges the 
intermediate results to create the inverted file. Where DocId partitioning is required the 
process can stop here; if TermId is required then a global merge is invoked. 


Loop until termination notice received 
    Receive a document/file from the farmer 
    Analyse document/file -> index 
    If memory limits exceeded at any point during analysis 
        then save index on disk 
    Send request for work to farmer 
EndLoop 
If memory limits have not been exceeded 
    Save index directly to create inverted file 
Else 
    Save current index to disk. 
    Merge data saved on disk to create inverted file 
EndIf 

Figure 3. Farmer algorithm for parallel indexing 

Figure 4. Worker algorithm for parallel indexing 

Figure 5. Global merge algorithm for parallel indexing (TermId only) 
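
The worker's pattern of inverting in memory, flushing when a limit is reached and merging the intermediate results can be sketched as follows. This is a simplified illustration under stated assumptions: a plain dictionary replaces the AVL tree buckets, in-memory lists stand in for temporary files on disk, and `max_postings` is a hypothetical stand-in for the memory limits.

```python
import heapq
from collections import defaultdict

def build_runs(docs, max_postings):
    """Invert documents in memory, flushing a sorted run whenever the limit is hit."""
    runs, index, count = [], defaultdict(list), 0
    for doc_id, text in docs:
        for pos, term in enumerate(text.lower().split()):
            index[term].append((doc_id, pos))  # position record for this term
            count += 1
            if count >= max_postings:  # limit may be hit mid-document
                runs.append(sorted(index.items()))
                index, count = defaultdict(list), 0
    if index:
        runs.append(sorted(index.items()))
    return runs

def merge_runs(runs):
    """Merge term-sorted runs into one inverted file: term -> (doc_id, position) list."""
    merged = {}
    for term, postings in heapq.merge(*runs, key=lambda item: item[0]):
        merged.setdefault(term, []).extend(postings)
    return merged

docs = [(0, "a b a"), (1, "b c"), (2, "a c c")]
runs = build_runs(docs, max_postings=3)
inverted = merge_runs(runs)
```

When everything fits in memory there is a single run and the merge is trivial, which mirrors the case where the inverted file is saved directly without a merge pass.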


3.3 Global merge process 

This further process is only used for TermId partitioning (Figure 5). The global merge 
process has three phases: a heuristic is applied to choose the distribution of the files, 
the files are then transferred across the network to the required node and a second local 
merge is initiated to create the final inverted file. The heuristic in the first phase works 
by calculating the average value for each of the 100 partitions and attempts to derive a 
distribution of buckets amongst nodes that is within a given criterion, currently within 
10 per cent of the average value: up to five iterations are used. The average chosen for 
distribution is to prevent a node being overloaded with data, while iterations were 
restricted to ensure the process of allocating terms to nodes was fast. The average 
value can be one of three variables on a bucket; word count (WC), collection 
distribution (CF) and term distribution (TF): we refer to these as term allocation 
strategies. When the distribution is generated it is used to transfer the files for that 
bucket to the node that has been allocated that bucket: this is done by gathering 
from all processes to the target process. The merge is then initiated on those 
transferred files. 
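
The idea of allocating buckets to nodes so that each node's load stays close to the average can be sketched as below. This is not the authors' heuristic, whose details (including its iterations) are not given here: it swaps in a simple greedy largest-first allocation followed by a check against the 10 per cent criterion. The function name and the weights are hypothetical; a weight could be any one of WC, CF or TF for a bucket.

```python
import heapq

def allocate_buckets(weights, n_nodes, tolerance=0.10):
    """Greedy largest-first allocation; returns (assignment, loads, balanced)."""
    target = sum(weights) / n_nodes  # average load per node
    heap = [(0, node) for node in range(n_nodes)]  # (current load, node)
    assignment = {}
    order = sorted(range(len(weights)), key=lambda b: weights[b], reverse=True)
    for bucket in order:
        load, node = heapq.heappop(heap)  # least-loaded node so far
        assignment[bucket] = node
        heapq.heappush(heap, (load + weights[bucket], node))
    loads = [0] * n_nodes
    for bucket, node in assignment.items():
        loads[node] += weights[bucket]
    balanced = all(abs(load - target) <= tolerance * target for load in loads)
    return assignment, loads, balanced

weights = [50, 40, 30, 30, 25, 25]  # hypothetical per-bucket values
assignment, loads, balanced = allocate_buckets(weights, n_nodes=2)
```

A single greedy pass is cheap, which is in keeping with the stated aim of keeping term allocation fast; a caller could retry with a different ordering when `balanced` is false.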


4. Local build topology components 

4.1 Timing process 

The only central process for local build is the timing process: it waits until all indexer 
processes are finished and saves the total elapsed time for the build. Our reasoning for 
using this method is to examine the scalability of our parallel data structures and 
algorithms: however, because of its minimal communication it is the one most would 
choose in many circumstances. 


Worker i 

(Phase 1) 
    Exchange word frequency data with all other workers 
    Partition words amongst workers using required word distribution 
    criteria (WC,TF,CF) 

(Phase 2) 
    Loop no of partitions -> j 
        If partition j belongs to worker i 
            gather partition j data from all other workers 
        Else 
            Send partition j data to required worker 
        EndIf 
    EndLoop 

(Phase 3) 
    Merge data for worker's partition to create inverted file. 


4.2 Indexer process 

Each indexer process is a sequential index process that takes the function of the farmer 
and worker processes, i.e. it reads in documents, breaks them down, and adds them to 
the index creating intermediate indexes when a given set of criteria is met. The 
intermediate results are then merged to form one index for each node. The indexer 
process only communicates with the timing process when it has finished building the 
index: apart from that, its work is completely independent of any other process. 


5. Indexing methodology 

For each index build we used a stop word list of 450 words supplied by Fox (1990) to 
filter out unwanted terms. All HTML/SGML tags are stripped from the text and 
ignored unless used for specific purposes such as identifying paragraphs (<p>) and the 
end of a document (</DOC>). Each identified word was put through a Lovins stemmer, 
supplied by the University of Melbourne, and indexed in stem form. Numbers were not 
indexed. A large amount of in-core memory is pre-allocated in blocks by each indexing 
process, and documents are analysed until one of several criteria is reached: exhaustion 
of keyword block, posting block or position block space. When one of the criteria is 
satisfied, the current analysis is saved on disk as an intermediate index, so that the 
in-core memory can be used for the next set of documents. When all documents have 
been analysed, the intermediate indexes are merged together to create the final index 
and deleted. 
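
The term analysis steps above (tag stripping, stop word removal, skipping numbers and stemming) can be sketched as a small pipeline. The stop list here is a tiny illustrative set rather than the 450-word list, and `toy_stem` is a hypothetical placeholder that is not the Lovins algorithm; the sketch also discards all tags rather than using paragraph and document markers.

```python
import re

STOP_WORDS = {"the", "of", "and", "a", "is", "in"}  # illustrative only
TAG = re.compile(r"<[^>]+>")
WORD = re.compile(r"[a-z]+")  # alphabetic tokens only, so numbers are skipped

def toy_stem(word):
    """Placeholder suffix stripper; NOT the Lovins stemmer."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def analyse(text):
    """Strip tags, drop stop words and numbers, and return stems in text order."""
    plain = TAG.sub(" ", text.lower())
    return [toy_stem(w) for w in WORD.findall(plain) if w not in STOP_WORDS]

stems = analyse("<p>The indexing of 42 files is slow</p>")
```

Keeping the stemmer behind a single function makes it straightforward to substitute a real implementation without touching the rest of the pipeline.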


6. Hardware used 

PLIERS (ParaLLel Information rEtrieval Research System) is designed to run on 
several parallel architectures and is currently implemented on those which use Sun 
Sparc, DEC Alpha and Pentium PII processors. All results presented in this paper were 
obtained on an eight node Alpha farm and eight nodes of a 12 node AP3000 at the 
Australian National University (ANU), Canberra. Each node has its own local disk: that 
is, a shared-nothing architecture (DeWitt and Gray, 1992) is used by PLIERS. For the 
Alpha farm, each node is a series 600 266 megahertz Digital Alpha workstation with 
128 megabytes of memory running the Digital UNIX 4.0b operating system. Two 
types of network interconnects were used: a 155 megabits per second ATM LAN with 
a Digital GIGAswitch and a 10 megabits per second Ethernet LAN: most of the 
indexing was done on ATM. The Fujitsu AP3000 is a distributed memory parallel 
computer using Ultra 1 processors running Solaris 2.5.1. Each node of the AP3000 has 
a speed of 167 megahertz. The machine we used has 12 nodes, but only eight are 
available on a partition. The torus network has a top bandwidth of 200 megabytes per 
second. 


7. Data description 

We use a number of collections in our experiments: BASE1 and BASE10, plus BASE2, 
BASE4, BASE6 and BASE8, which are subsets of BASE10. BASE1 and BASE10 are officially defined 
samples of the 100 gigabytes VLC2 collection (Hawking et al., 1999) and are 1 and 10 
gigabytes in size, respectively. The subsets of the official BASE10 collection were 
created by varying the number of BASE10 compressed text files put through the 
indexing mechanism (130 files per node for BASE2, 260 for BASE4, 390 for BASE6, 
520 for BASE8). Each of the BASE x collections is approximately x gigabytes in size. 

The strategy used to distribute the BASE1 and BASE10 collections for local build was to 
evenly spread the directories (in which the data are distributed by the ANU) among the 
nodes as far as possible. An alternative, if more time-consuming, strategy is to do it by 
file size. The requirement of a distribution strategy is to get the best possible load 
balance for indexing as well as term weighting and passage retrieval search. The 
distribution process was done before the indexing program was started, and is not 
included in the timings. 

Two types of inverted files were used for experiments: one type that recorded 
position information (necessary for passage retrieval and adjacency operations) and 
one that recorded postings only in the inverted list. The conventional form of inverted 
file was used with a clear keyword and postings file split. A document map was also 
used to store data such as document length: this file is fragmented with local build and 
replicated with distributed build. Map data on distributed build with DocId could be 
fragmented, but we chose to replicate rather than maintain extra source code in order 
to save time. 


8. Index generation time costs 

In this section we report the timing results on indexing using the configurations 
described above. The results are compared and contrasted where necessary, and are also 
compared with available results for other systems on the BASE1 and BASE10 
collections used in the VLC2 sub-track at TREC-7 (Hawking et al., 1999). We use the 
local build method on all defined collections, but only BASE1 is indexed using the 
distributed build method. The measures discussed are: indexing elapsed time in hours, 
throughput, scalability, scaleup, speedup and efficiency, load imbalance (LI) and merging costs. Metrics 
used are defined in the glossary. Results on the Alpha farm and the AP3000 are 
discussed. 


8.1 Indexing elapsed time 
In general, the Alpha farm was much faster than the AP3000 for indexing elapsed time 
as its processors are faster. For example, on BASE10 local build indexing with 
postings only data took 0.82 hours on the Alphas and 1.08 hours on the AP3000 
(Figure 6). The Alpha elapsed times recorded on local build also compare well with the 
results given at VLC2 (Hawking et al., 1999). That is, on BASE1 only two groups report 
slightly faster times than our postings only elapsed time of 0.065 hours (0.043 and 0.052 
hours). Our sequential elapsed time on BASE1 of 0.56 hours (postings only) also compares 
well with those groups utilising a single processor: two other groups using 
uniprocessors recorded 0.42 and 1 hour, respectively (refer to Figures 7 and 8). On 
BASE10 on the Alphas the comparison is even more encouraging: only one group 
records a faster time of 0.504 hours. It should be noted that while the group with the 
fastest BASE10 indexing time uses a much smaller machine configuration (four Intel 
PI processors) they use a very different method of inversion in which the collection is 
treated as one document (Clarke et al., 1998). 
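
To make the measures concrete, the sketch below applies the standard parallel-computing definitions of throughput, speedup and efficiency to the elapsed times quoted above. These are textbook definitions and may differ in detail from those in the paper's glossary. The speedup example assumes, which the text does not state, that the 0.065 hour BASE1 run used all eight Alpha nodes, and it treats BASE10 as exactly 10 gigabytes.

```python
def throughput(size_gigabytes, elapsed_hours):
    """Gigabytes of text indexed per hour."""
    return size_gigabytes / elapsed_hours

def speedup(sequential_hours, parallel_hours):
    """Ratio of sequential elapsed time to parallel elapsed time."""
    return sequential_hours / parallel_hours

def efficiency(sequential_hours, parallel_hours, n_nodes):
    """Speedup per node; 1.0 corresponds to linear speedup."""
    return speedup(sequential_hours, parallel_hours) / n_nodes

base10_throughput = throughput(10, 0.82)   # BASE10 local build on the Alphas
base1_speedup = speedup(0.56, 0.065)       # BASE1, postings only
base1_efficiency = efficiency(0.56, 0.065, n_nodes=8)  # node count is an assumption
```

An efficiency above 1.0 under these assumptions would indicate super-linear speedup, so the figures should be read as illustrative rather than as results reported by the paper.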

The results for distributed build indexing are shown in Figures 7 and 8. The elapsed 
times for DocId are much better than those for the TermId method. This trend can be 
seen in all of the diagrams irrespective of machine or inverted file type used. The 
smallest difference is found on indexes with postings only using the AP3000. In 
general TermId elapsed times were longer than DocId because of the amount of data 
that has to be exchanged between nodes for the method, particularly for indexes with 
position data. Very little difference in time was found in any of the term allocation 
strategies (Section 3.3) studied for TermId. 

One interesting factor found in the TermId results was that the AP3000 
outperformed the Alpha farm at seven worker nodes largely due to the extra network 
bandwidth available. It is at this point that the compute/communication balance 
favours the AP3000. A further run using distributed build with DocId partitioning on 
the Alpha farm revealed how much faster it is to use the ATM network than the 
Ethernet network: the time with ATM on two worker nodes building an index for 
BASE1 with no position data was 0.27 hours, while the figure for Ethernet was nearly 
double at 0.47 hours. This comparison further illustrates the importance of network 
bandwidth to the distributed build method, a factor which can cause problems in many IR 
tasks (Rungsawang et al., 1999). As a consequence, we did not conduct any further 
experiments on this type of build for indexing using the Ethernet network. 

Figure 6. BASE1-10 local build (DocId): indexing elapsed time in hours 

Figure 7. BASE1 distributed build: indexing elapsed times in hours 

Figure 8. BASE1 distributed build: indexing elapsed times in hours (postings only) 

Figure 9. BASE1 distributed build: indexing extra costs for storage of position data 

The extra time costs engendered by generating inversion with position data varied 
(this ratio is declared in the glossary; our aim is to record a ratio as close to 1.0 as 
possible). For example, in local build DocId the difference between postings only 
generation and position data generation ranged between 1.09 and 1.37 times on the 
Alphas (where merging was required). The extra costs on BASE1 are the highest (1.25 
for the AP3000 and 1.37 for the Alphas) because the index with postings only is saved 
directly to disk without the need for merging: merging is required only when memory 
limits have been exceeded. Figure 9 shows the ratios for distributed build experiments. 
Whether these extra costs are justified depends on the query processing requirement, 
such as a user need for passage retrieval or proximity operators. 


8.2 Throughput 

The metric we use for throughput is gigabytes of text processed per hour (Gb/h), which 
allows us to compare performance between database builds. Figure 10 shows the throughput for 
eight processor configurations. The throughput for the Alphas is much higher than for 
the AP3000, e.g. on BASE1 local build indexing with postings only the rate is 15.4 
gigabytes per hour compared with 9.5 gigabytes per hour on the AP3000. These are by 
far the best throughput results because no merging was needed: the configuration had 
enough memory to store the whole index and save it directly. The rate for other 
collections for local build indexing was 12-14 gigabytes per hour on the Alphas for 
postings only. Only one VLC2 participant recorded faster throughput for the BASE1 and 
BASE10 collections (just over 19 gigabytes per hour). The throughput on BASE1 using 
distributed build DocId is not as good as the local build but is still encouraging 
(Figure 11). 

It was found that increasing the number of worker nodes increased the throughput for 
both distributed build methods. For example, the DocId results for seven worker nodes 
yielded a throughput of 9.7 gigabytes per hour on the Alphas for postings only data 
indexes, compared with 1.8 for the uniprocessor experiment. The throughput for 
TermId builds was not as impressive but still acceptable with postings only: for 
example, 5.8 gigabytes per hour was recorded on the AP3000. The throughput for 
builds with position data was not as good, with 4.5 gigabytes per hour on the AP3000 
(Figure 12). Note that we only declare results for TermId with the WC method as there 
is very little difference in measurement between any of the term allocation strategies 
studied. Note also the superior performance in throughput on the AP3000 at seven 
worker nodes due to the extra bandwidth available with that machine. 

Figure 10. BASE1-10 local build (DocId): indexing gigabytes per hour throughput 

Figure 11. BASE1 distributed build (DocId): indexing gigabytes per hour throughput 

Figure 12. BASE1 distributed build (TermId): indexing gigabytes per hour throughput (WC only) 

Figure 13. BASE2-10 local build (DocId): indexing scalability from BASE1 
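
As a rough check on the figures quoted in this section, throughput can be recomputed from collection size and elapsed time. The sketch below assumes that BASE1 and BASE10 hold 1 and 10 gigabytes of text respectively (the sizes are not restated here) and uses the Alpha local build elapsed times from Section 8.1; the function name is our own.

```python
def throughput_gb_per_hour(text_size_gb: float, elapsed_hours: float) -> float:
    """Gigabytes of text indexed per hour of elapsed time."""
    if elapsed_hours <= 0:
        raise ValueError("elapsed time must be positive")
    return text_size_gb / elapsed_hours


# Assumed collection sizes: BASE1 = 1 gigabyte, BASE10 = 10 gigabytes.
# Elapsed times: Alpha local build, postings only (Section 8.1).
base1_alpha = throughput_gb_per_hour(1.0, 0.065)
base10_alpha = throughput_gb_per_hour(10.0, 0.82)

print(f"BASE1 on the Alphas:  {base1_alpha:.1f} Gb/h")
print(f"BASE10 on the Alphas: {base10_alpha:.1f} Gb/h")
```

Under these assumptions the BASE1 value agrees with the reported 15.4 gigabytes per hour, and the BASE10 value falls within the 12-14 gigabytes per hour range quoted for the other collections.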


8.3 Scalability 

The data measure used in the equation is the size of indexed text. The scalability 
metric is defined in the glossary. We measure the effect of increasing collection size on 
the same sized parallel machine using the BASE2-10 collections over the BASE1 
collection. We look for a scalability of around 1.0, greater than 1.0 being the aim. The 
results are shown in Figure 13. With postings only data the scalability ranges between 
0.80 and 0.93 on the Alphas and 0.92 and 0.99 on the AP3000. These figures are rather 
distorted because of the direct save on BASE1, that is, no merging was needed as 
memory limits were not exceeded. The results are therefore on the pessimistic side (if more 
memory were available we might be able to save indexes directly on all the collections 
studied). In builds with position data the scalability is excellent, with the Alphas 
registering super-linear scalability on most BASEx collections (BASE10 was the exception) and 
the AP3000 delivering super-linear scalability on BASE6, 8 and 10. The scalability 
results for indexes with position data demonstrate that the algorithms and data 
structures implemented are well able to cope with the extra computational load and 
data size that such builds require. 
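
The glossary equation is not reproduced in this section, so the following sketch assumes one commonly used form of the scalability metric: the ratio of elapsed times (small over large) multiplied by the ratio of data sizes (large over small). The collection sizes of 1 and 10 gigabytes for BASE1 and BASE10 are likewise assumptions, and the times are the Alpha postings only figures from Section 8.1.

```python
def scalability(time_small_h: float, size_small_gb: float,
                time_large_h: float, size_large_gb: float) -> float:
    """Assumed form: (time_small / time_large) * (size_large / size_small).

    A value of 1.0 means elapsed time grew exactly in proportion to data size;
    values above 1.0 mean the larger collection was indexed proportionally faster.
    """
    return (time_small_h / time_large_h) * (size_large_gb / size_small_gb)


# BASE10 measured against BASE1 on the Alphas, postings only.
alpha_base10 = scalability(0.065, 1.0, 0.82, 10.0)
print(f"Scalability, BASE10 over BASE1: {alpha_base10:.2f}")
```

With the rounded published times this gives a value just under 0.80, which is close to the lower end of the range reported for the Alphas; the small gap is plausibly due to rounding of the elapsed times.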


8.4 Scaleup 

The scaleup metric is declared in the glossary. We measure within BASEx scaleup for 
local build only in this section. We take the times on each individual processor and 
compare the smallest elapsed time with the largest elapsed time on all eight nodes. We 
are comparing the smallest sub-collection of BASEx (1/8th of BASEx) with the full 
sized BASEx collection. We use the least favourable figure in our measurement to 
obtain the lowest scaleup from any of the chosen sub-collections: our measurements are 
therefore pessimistic. We look for a scaleup of around 1.0, greater than 1.0 being the 
aim. The results are shown in Figure 14. In general the scaleups recorded are very good 
with most above the 0.8 mark. The worst scaleup was measured over the BASE10 
collection on builds with no position data with a figure of 0.77. This figure was found 
on the Alpha farm where the processors are much faster. A combination of data size 
and processor speed can have an impact on scaleup: the scaleup figures for indexes 
with position data on the Alpha farm are generally superior to indexes without such 
data. The situation is reversed for the AP3000 where the processors are slower. These 
scaleup figures show that there is little deterioration in performance of our 
implemented data structures and algorithms when moving from indexing a smaller collection 
on a small configuration parallel machine to indexing a larger collection 
on a larger configuration machine. 
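
The measurement procedure described above can be sketched as follows. This is our reading of the text rather than the glossary definition: the smallest per-node elapsed time stands in for one processor indexing one-eighth of the collection, and the largest per-node time stands in for the eight-node run on the full collection. The node times are hypothetical.

```python
def pessimistic_scaleup(node_times_hours):
    """Smallest per-node elapsed time divided by the largest per-node elapsed time.

    Using the extremes yields the lowest scaleup obtainable from any choice of
    sub-collection, which is why the measure is described as pessimistic.
    """
    if not node_times_hours:
        raise ValueError("need at least one node time")
    return min(node_times_hours) / max(node_times_hours)


# Hypothetical elapsed times (hours) for the eight nodes of a local build.
node_times = [0.70, 0.74, 0.76, 0.78, 0.79, 0.80, 0.81, 0.82]
print(f"Scaleup: {pessimistic_scaleup(node_times):.2f}")
```

A perfectly balanced run, in which every node takes the same time, would give a scaleup of exactly 1.0 under this reading.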


8.5 Speedup and efficiency 

All figures relate to the BASE1 collection. Definitions of these metrics can be found in 
the glossary. Recall that our ideal speedup is equal to the number of nodes, whereas for 
efficiency we look for a figure of 1.0. A surprising feature was the super-linear speedup 
and efficiency figures found with some of the indexing experiments, particularly for the 
local build DocId eight processor runs (Table I). For example, with the direct save on 
postings only data, local build on the Alphas yielded a speedup of 8.5 and efficiency of 
1.07. This effect was also found on some of the runs using distributed DocId indexing 
(Figures 15 and 16). 

The reason this effect can occur is the extra memory multiple nodes have compared 
with a sequential processor, i.e. on local build with eight nodes the index fits into main 
memory and it can be saved directly without the need for merging. More memory 
reduces the number of intermediate results saved to disk and therefore saves I/O time 
when data are merged to create the index. On distributed build a two-worker 
configuration has twice the memory of the sequential program. The super-linear effect 
tails off at various stages on the distributed version as communication time becomes 
more important (Figure 15). 

With TermId communication is very important: the global merge reduces most 
speedup/efficiency measures to less than linear (Figures 17 and 18). With position data 
and TermId there is little speedup on the Alpha farm and efficiency ranges from 
average to poor. Interestingly, super-linear speedup/efficiency does occur on two 
worker nodes with builds on postings only data: further evidence of the significance of 
the memory effect. 

Figure 14. BASE1-10 local build (DocId): indexing scaleup 

Figure 16. BASE1 distributed build (DocId): indexing efficiency 

Figure 17. BASE1 distributed build (TermId): indexing speedup (WC only) 
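
The super-linear figures can be approximately reproduced from the elapsed times given in Section 8.1. The sketch below uses the standard definitions (speedup as sequential time over parallel time, efficiency as speedup divided by the number of nodes), which we assume match the glossary; the function names are our own.

```python
def speedup(sequential_hours: float, parallel_hours: float) -> float:
    """How many times faster the parallel run is than the sequential run."""
    return sequential_hours / parallel_hours


def efficiency(sequential_hours: float, parallel_hours: float, nodes: int) -> float:
    """Speedup per node; 1.0 is linear, above 1.0 is super-linear."""
    return speedup(sequential_hours, parallel_hours) / nodes


# BASE1, postings only, on the Alphas: 0.56 hours sequential,
# 0.065 hours for the eight-node local build (Section 8.1).
s = speedup(0.56, 0.065)
e = efficiency(0.56, 0.065, 8)
print(f"speedup = {s:.1f}, efficiency = {e:.2f}")
```

Because the published times are rounded, the recomputed values come out slightly above the reported 8.5 and 1.07, but they show the same super-linear behaviour: a speedup greater than the eight nodes used.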


8.6 Load imbalance 

The LI metric we use is declared in the glossary; the ideal load balance gives an LI close to 1.0. In 
general it was found that the distributed build imbalance was lower than that of local 
build (Figures 19 and 20). In fact distributed build using any partitioning method is 
excellent on all nodes with both methods, e.g. on 2-7 Alpha and AP3000 workers the LI 
was in the range 1.002-1.03 on average for DocId. The LI figures demonstrate that the 
implemented process farm method provides good load balance for indexing jobs when 
whole files are distributed to workers. 

Figure 18. BASE1 distributed build (TermId): indexing efficiency (WC only) 

Figure 19. BASE1-10 local build (DocId): indexing LI 

Figure 20. BASE1 distributed build (DocId): indexing LI 

Figure 21. BASE1 distributed build (TermId): indexing LI 

Table II. BASE1-10 local build (DocId): percentage of average elapsed indexing time spent merging 

The results for TermId were generally not as good as DocId, but good in the average 
case (Figure 21). The exception was for builds with position data on six nodes: LIs of 
1.2 for the AP3000 and 1.15 for the Alphas were recorded with WC distribution. The 
farm method described in Section 3 above is a very good way of ensuring load balance 
in the majority of cases. The local build LI is still very good: the worst LI recorded was 
1.17 for BASE10 for the Alpha postings only run. We conclude by stating that both 
distributed and local build methods achieve good load balance, but local build LI could 
be improved by paying more attention to text distribution. 
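
For illustration, LI can be computed as the maximum value divided by the average value, which is the form given for file sizes in Section 9.3; we assume the same form applies to per-node elapsed times. The worker times below are hypothetical.

```python
def load_imbalance(values):
    """Maximum value divided by the mean; 1.0 means a perfectly even load."""
    if not values:
        raise ValueError("need at least one value")
    return max(values) / (sum(values) / len(values))


# Hypothetical per-worker indexing times in minutes for a seven-worker build.
worker_minutes = [16.0, 16.2, 16.1, 16.4, 16.0, 16.3, 16.2]
print(f"LI = {load_imbalance(worker_minutes):.3f}")
```

The slowest node determines when the build finishes, so an LI of 1.2 means the build took about 20 per cent longer than it would have with the same work spread perfectly evenly.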


8.7 Merging costs 

We consider here the percentage of time spent merging the temporary results to create 
the final inverted file (see the glossary for a formal definition); we look for the lowest 
possible cost in percentage terms. We examine the DocId method first. The merging for 
local build was in the main consistent within a 1 per cent range, e.g. on the Alphas with 
postings only data, the average merge cost was 14-15 per cent (Table II). Merging costs 
for builds with position data were higher, e.g. on the Alphas the merge cost was 19-20 
per cent. Merge costs on the AP3000 were lower on local build, e.g. with postings only data 
the average merge cost was around 13-14 per cent. This difference arises because the Alpha 
farm processors are much faster and therefore the I/O time (which remains constant) is 
more significant. 

With distributed DocId build the merging costs were much the same as local 
build apart from Alpha builds with position data, where the range found was 17-20 per cent; 
even these costs did not vary much from the local build (Table III). The uniprocessor builds 
with position data registered the highest merge costs, whereas parallel DocId builds 
without position data saved indexes directly without the need for merging on eight 
processors. The merge costs were more prominent on the Alphas as the faster processor 
speed reduces the computational costs and increases the importance of I/O (merge is an 
I/O intensive process). Merge costs are also more prominent on indexes which contain 
position data. 

Merge costs for TermId are very much higher, as one would expect given the extra 
work required for merge with that method to exchange data between nodes (Table IV). 
These higher merge costs are a contributory factor in the overall loss of performance 
for TermId partitioning index builds. However, there is a distinct decrease in all cases 
in the significance of merging on the Alphas, e.g. merging on indexes with position 
data and WC word distribution decreased from 44 per cent at two workers to 30 per 
cent on seven workers. This is largely because the cost of transferring index data 
before the second merge can proceed increases with the number of worker nodes 
deployed, e.g. on the Alpha indexes with position data the increase is from 2 minutes at 
two workers to 4 minutes at seven workers. On the AP3000 a slight decrease in 
merging costs is recorded in most cases, but the decrease is not as pronounced as on the 
Alphas. The Alphas’ extra processor speed benefits the extra merging required when 
building TermId indexes. The corresponding figure for transferring indexes with 
position data on the AP3000 ranges from 2.4 minutes with two workers to 2.9 minutes 
with seven workers. The AP3000 is better able to cope with this extra cost in 
transferring data for the second merge as it has extra bandwidth available in its 
network. 
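
The reasoning above, that the merge share falls as transfer time grows, can be made concrete with a small calculation. The sketch assumes that transfer time counts towards elapsed time but not towards merge time, and all timings other than the 2 and 4 minute transfer figures are hypothetical.

```python
def merge_cost_percent(merge_minutes: float, elapsed_minutes: float) -> float:
    """Percentage of elapsed indexing time spent merging."""
    if elapsed_minutes <= 0:
        raise ValueError("elapsed time must be positive")
    return 100.0 * merge_minutes / elapsed_minutes


# Hypothetical merge and other-work times in minutes; only the transfer
# times (2 and 4 minutes) echo the Alpha figures quoted in the text.
merge, other_work = 10.0, 10.0
share_two_workers = merge_cost_percent(merge, merge + other_work + 2.0)
share_seven_workers = merge_cost_percent(merge, merge + other_work + 4.0)

print(f"two workers:   {share_two_workers:.1f} per cent")
print(f"seven workers: {share_seven_workers:.1f} per cent")
```

Even with merge time held constant, the merge percentage drops when more of the elapsed time goes on moving index data between nodes, so a falling percentage does not by itself indicate that merging became cheaper.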


8.8 Summary of time costs for indexing 

With respect to comparable metrics such as elapsed time and throughput, we have 
demonstrated that for at least one partitioning method, namely DocId, our results are 
state of the art compared with other VLC2 participants (Hawking et al., 1999). We have 
found that in most cases the Alpha farm outperforms the AP3000 except for some 
TermId runs: the AP3000 has a much higher bandwidth network available to it, which is 
an advantage in such builds. Comparing the partitioning methods we have found that 
builds using the DocId method outperform index builds using TermId in all 
experiments. Our speedup and efficiency figures show that the methods of parallelism 
do bring time reduction benefits, particularly for the DocId partitioning method. The 
scalability and scaleup figures show that our implemented data structures and 
algorithms are well able to cope with increasingly larger databases on a same sized or 
larger parallel machine. The LI is generally quite small for all runs. The extra costs for 
generating indexes with position data vary, but are not an insubstantial part of the 
overall costs. Merge costs are also an important element of total time, depending on the 
build and partitioning method used. 

Table III. BASE1 distributed build (DocId): percentage of average elapsed indexing time spent merging 

          Alphas                            AP3000 
Workers   NPOS (per cent)  POS (per cent)   NPOS (per cent)  POS (per cent) 
1         15               24               9                16 
2         15               20               9                14 
3         15               19               10               14 
4         15               20               10               14 
5         15               19               10               13 
6         14               18               9                13 
7         13               17               9                14 


9. Index file space costs 

In this section we declare the space overheads using the configurations described 
above. The results are compared and contrasted where necessary, and are also compared 
with overheads on the BASE1 and BASE10 collections used in the VLC2 sub-track at 
TREC-7 (Hawking et al., 1999). The space overheads discussed are: overall inverted file 
space costs, keyword file space costs and file space imbalance. 


9.1 Inverted file space costs 

The metrics we used here are the file sizes in gigabytes and percentage of original text 
size. The space costs for local build indexes are fairly constant in percentage terms 
across all collections (Figure 22(b)), although a slight reduction in index size compared 
with the size of the text can be seen in Figure 22(a). This reduction occurs irrespective 
of the type of data stored in the inverted file. From Figure 23 we can observe that there 
is a slight increase in index size as the processor set increases when using distributed 
build methods. This is because of the replicated map requirements of 
distributed builds. The increase is more marked for DocId partitioning. If the map file 
size is taken away from the total size then the DocId index increase is much smaller 
(the reason for any increase at all is explained in Section 9.2). 

The comparison with space costs of the VLC2 participants (Hawking et al., 1999) is 
favourable with postings only data: our smallest figure of 0.11 gigabytes on BASE1 
was smaller than all submitted results and on BASE10 only one VLC2 participant at 
0.902 gigabytes was smaller than our figure of 1.1 gigabytes. The comparison for 
files that contain position data is not so good: our smallest figure of 0.31 gigabytes 
for BASE1 is bested by two groups, while on BASE10 three groups record a smaller 
figure than our 3.0 gigabytes. 
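
The second space metric, index size as a percentage of the original text size, can be recomputed from the sizes quoted above. The sketch assumes that BASE1 and BASE10 contain 1 and 10 gigabytes of text respectively; the function and variable names are our own.

```python
def index_percent_of_text(index_gb: float, text_gb: float) -> float:
    """Index size expressed as a percentage of the original text size."""
    return 100.0 * index_gb / text_gb


# (index size in gigabytes, assumed text size in gigabytes)
reported = {
    ("BASE1", "postings only"): (0.11, 1.0),
    ("BASE10", "postings only"): (1.1, 10.0),
    ("BASE1", "position data"): (0.31, 1.0),
    ("BASE10", "position data"): (3.0, 10.0),
}

for (collection, kind), (index_gb, text_gb) in reported.items():
    pct = index_percent_of_text(index_gb, text_gb)
    print(f"{collection:7s} {kind:14s} {pct:5.1f} per cent of text")
```

Under these assumptions postings only indexes cost about 11 per cent of the text on both collections, and indexes with position data about 30-31 per cent, which is consistent with the observation that space costs are fairly constant in percentage terms.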


9.2 Keywords file space costs 

The metric for keyword file space costs is the size in megabytes and the keyword file 
percentage of the total inversion. With local build on both postings only and position 
data we found that the trend in keyword space costs was a decreasing one, e.g. from 32 
per cent on BASE1 to 22 per cent on BASE10 with postings only data (Figure 24(b)). 
This is because the increase in lexicon is not linear with the increase in collection size 
(Figure 24(a)). With distributed DocId indexes the keyword costs remain constant, e.g. 
24-26 per cent (Figure 25(b)). The size of the keyword file actually increases with more 
inverted file partitions (Figure 25(a)), but this increase is not significant and is 
absorbed by the increase in size of the replicated document map. We state that there is 
little extra cost in having words replicated across different fragments for DocId 
partitioning on this type of collection (web data). For TermId indexes the size of the 
keyword file was constant irrespective of term allocation method, and if the map data 
are included in costs the significance of the keyword file with respect to the total index 
size gradually decreases (Figure 25(a) and (b)). 


9.3 File load imbalance 

We use the concept of LI but apply it to file sizes instead, i.e. maximum file size/average 
file size. We wish to ensure that index data are fairly distributed amongst nodes, e.g. it 
would not be desirable for one index partition to exceed the space available on a 
physical disk. The index time LI results are included in Figures 26-28 for comparative 
purposes. The space imbalance for text space costs was in general fairly stable, being in 
the range 1.02-1.04 for all local build indexing runs (Figure 26). In comparison the 
inverted file imbalance was much higher, particularly for the smaller collections. 
Clearly the imbalance stems not from the size of the text, but from aspects of the text 
such as the number of documents and total word length of the text. In contrast the 
space imbalance for distributed build on DocId partitioning was small for any type of 
inverted file data storage (Figure 27). There is no significant difference between the 
space imbalance of inverted files and LI for indexing times with DocId partitioning. 
The file space imbalance figures demonstrate the validity of the farming method for 
balancing load for DocId partitioning. 

The situation for TermId varies depending on the type of word distribution method 
used (Figure 28). For the WC distribution space imbalance was generally very poor, 
with the worst being indexes with position data on six worker nodes: an imbalance of 
1.52 was recorded (interestingly, this configuration also gave the worst imbalance for 
indexing times, see Figure 28). 
The figures for the collection frequency distribution method (CF) are much better, with 
an imbalance range of 1.02-1.07 for all builds. In the term frequency (TF) method the 
imbalance was erratic, being very poor at five and six worker nodes for any index 
builds, but good on all other runs. Any imbalance in space does not affect 
computational imbalance adversely. None of the TermId space imbalance results are as 
good as the DocId results for space costs on distributed builds, as it is much harder to derive a 
good data distribution method for TermId indexes (the allocation of terms to nodes is a 
more difficult problem than allocating documents to nodes). None of the methods 
implemented affect space imbalance such that an index partition exceeds the physical 
disk of any node. 


9.4 Summary of space costs for indexing 

Overall space overhead for the indexing is state of the art and comparable with the 
results given by VLC2 participants, at least for indexes with postings only. The 
distributed build DocId results show that the cost of storing keywords does grow with 
increasing fragmentation, but given that local build results show that space costs 
decrease with database size we do not see this as a serious overhead for the DocId 
partitioning method. The space costs imbalance for local build is generally quite stable, 
but the generated inverted files vary more. Clearly, the consideration of the number of 
files on its own is not sufficient to ensure very good balance. For distributed builds 
space imbalance was much smaller, except for some TermId indexes where 
distribution methods are more difficult to derive; no index partition exceeds the size of 
a node’s local disk. 
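
To illustrate why term distribution is hard to balance, the sketch below allocates terms to nodes with a simple greedy heuristic and then measures file space imbalance as maximum partition size over average partition size. The heuristic is our own stand-in and is not the WC, CF or TF method of Section 3.3, whose details are not given in this section; the posting list sizes are hypothetical.

```python
import heapq


def allocate_terms(posting_sizes, nodes):
    """Greedy stand-in: place each term, largest first, on the lightest node."""
    heap = [(0, node) for node in range(nodes)]  # (size assigned so far, node id)
    heapq.heapify(heap)
    assignment = {}
    for term, size in sorted(posting_sizes.items(), key=lambda kv: -kv[1]):
        load, node = heapq.heappop(heap)
        assignment[term] = node
        heapq.heappush(heap, (load + size, node))
    return assignment


def space_imbalance(posting_sizes, assignment, nodes):
    """Maximum partition size divided by the average partition size."""
    totals = [0] * nodes
    for term, node in assignment.items():
        totals[node] += posting_sizes[term]
    return max(totals) / (sum(totals) / nodes)


# Hypothetical posting list sizes (arbitrary units) with one dominant term.
sizes = {"the": 900, "index": 400, "merge": 350, "node": 300, "alpha": 120, "farm": 80}
assignment = allocate_terms(sizes, 3)
print(f"space imbalance = {space_imbalance(sizes, assignment, 3):.2f}")
```

Because a posting list cannot be split between nodes under term partitioning, a single very frequent term can dominate one partition however the remaining terms are placed. Documents, by contrast, are numerous and individually small, so document allocation balances far more easily.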


10. Conclusion 

The results produced in this paper show that of the partitioning methods, DocId 
partitioning using any build has by far the most promise and would in most 
circumstances be the method chosen for indexing. This would be the case 
particularly if the collection under consideration needed frequent re-builds. We 
have used the DocId method to good effect in the web track for TREC-8 on the 
full 100 gigabytes VLC2 collection (MacFarlane et al., 2000). Where disk space was 
limited, the local build method could be used to good effect: we used this build 
method on the BASE10 collection as we did not have sufficient space to do distributed 
builds on that collection. We have demonstrated that indexing is state of the art in 
both compute and space terms by comparing our space and time results with 
those given at VLC2 (Hawking et al., 1999) and the TREC-8 web track (Hawking 
et al., 2000). Although we did not produce the best results for all measures, no 
group at VLC2 did either. Our indexing time for the full 100 gigabytes collection 
was the best in the web track (MacFarlane et al., 2000). 

A clear distinction must be made between DocId and TermId partitioning methods. 
Distributed build DocId out-performs TermId in all areas of time cost metrics and 
would therefore always be preferred if indexing was of primary concern. We state this 
irrespective of the type of inversion or algorithms/methods used if cluster computing is 
utilised. We would recommend that TermId only be used if two main criteria are met. 
One is that a high performance network is available to reduce time spent on 
transferring data during the global merge process. The other is that some other benefit 
must accrue from the use of TermId partitioning, which in essence would be some 
advantage in search performance or index maintenance over the DocId 
method. 
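
The distinction between the two partitioning methods can be illustrated on a toy collection. The sketch below is not the implementation evaluated in this paper: it assumes the usual reading of TermId partitioning (each term's complete posting list is held by one partition), uses round-robin placement for simplicity, and all names and documents are hypothetical.

```python
from collections import defaultdict


def build_postings(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        for term in sorted(set(docs[doc_id].split())):
            index[term].append(doc_id)
    return dict(index)


def partition_by_doc(docs, partitions):
    """DocId: all data for a given document goes to exactly one partition."""
    shards = [dict() for _ in range(partitions)]
    for doc_id, text in docs.items():
        shards[doc_id % partitions][doc_id] = text
    return [build_postings(shard) for shard in shards]


def partition_by_term(docs, partitions):
    """TermId (assumed): each term's whole posting list goes to one partition."""
    full = build_postings(docs)
    shards = [dict() for _ in range(partitions)]
    for i, term in enumerate(sorted(full)):
        shards[i % partitions][term] = full[term]
    return shards


docs = {0: "parallel index build", 1: "index merge",
        2: "parallel merge cost", 3: "index cost"}
by_doc = partition_by_doc(docs, 2)
by_term = partition_by_term(docs, 2)
print("DocId partitions holding 'index': ", sum("index" in p for p in by_doc))
print("TermId partitions holding 'index':", sum("index" in p for p in by_term))
```

In the DocId case each partition can be inverted independently, at the price of a keyword appearing in several partitions. In the TermId case postings from every document must be brought together for each term, which corresponds to the data exchange of the global merge discussed above.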


Glossary 

CF allocation = method of term allocation in TermId to partition using a collection 
frequency criterion 

Distributed build = method of building indexes where text is distributed from a single node 

DocId = partitioning method which assigns all document data for a given 
document to one index partition 

Efficiency = measure of the effective use of processors. Definition: 

Examples of build methods for distributed inverted files 


Creative information seeking 


Part I: a conceptual framework 

Abstract 


Purpose – This paper proposes a conceptual framework for creative information seeking, drawing 
upon Weisberg’s argument that creativity exists in everyone, and mapping the creative process 
described in the holistic model of creativity to the information seeking activities identified in the 
behavioural model of information seeking. 

Design/methodology/approach – Using scenarios of information seeking behaviour, mappings 
between the creative process and information seeking activities were refined and six stages for 
creative information seeking were proposed. Scenarios were also used to provide theoretical 
justifications for stages in creative information seeking. 

Findings – Evidence gathered from the scenarios seemed to indicate that the type of information 
seeking task may have an impact on the extent to which an information seeker exhibits all stages in 
the framework. This is on-going research. Part II of this paper aims to conduct empirical studies and 
gather evidence to verify the framework and examine this observation in more detail. 

Originality/value – Proposes a framework for creative information seeking. 

Keywords Information seeking, Creativity, Information retrieval systems 

Paper type Conceptual paper 


Introduction 

Current information retrieval (IR) systems are designed to judge precision and recall 
based on a match between index and query terms. This mode of operation is the 
“best-match” principle (Belkin et al, 1982). Precision and recall measures using this 
principle are limited due to its assumptions. 
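
As a minimal illustration of the principle, the sketch below ranks documents by the number of terms they share with the query and then computes precision and recall against a user's relevance judgements. It is a deliberate simplification of best-match retrieval, and the documents, index terms and judgements are hypothetical.

```python
def best_match(query, index_terms):
    """Rank documents by how many query terms appear among their index terms."""
    query_terms = set(query.lower().split())
    scored = [(len(query_terms & terms), doc_id)
              for doc_id, terms in index_terms.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ranked if score > 0]


def precision_recall(retrieved, relevant):
    """Precision: share of retrieved that is relevant. Recall: share of relevant retrieved."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical index terms assigned by an indexer.
index_terms = {
    "d1": {"creativity", "models"},
    "d2": {"information", "seeking"},
    "d3": {"information", "retrieval", "systems"},
}
retrieved = best_match("information seeking behaviour", index_terms)
precision, recall = precision_recall(retrieved, relevant={"d1", "d2"})
print(retrieved, precision, recall)
```

In this toy case the user judges d1 relevant, but it shares no terms with the query and so is never retrieved, while d3 is retrieved on a single shared term. Both measures suffer for reasons that have nothing to do with the quality of the matching itself.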

The “best-match” principle assumes that users can specify their information needs 
in a query. Unfortunately, this is difficult as users are unfamiliar with the system’s 
operations, vocabularies used to describe its documents, and characteristics of 
information in the system (Belkin, 2000). Hence, documents retrieved may be irrelevant 
to users. IR systems built using the “best-match” principle also assume that the 
perspectives and words used to describe documents are similar between indexers and 
users. This assumption limits precision and recall measures, as indexers are experts in 
their subject areas and describe documents based on their properties (Henninger and 
Belkin, 1996). Users, on the other hand, are less knowledgeable and hence use 
terminologies that differ from those of experts. The third limitation of the “best-match” 
principle is the assumption that documents retrieved are relevant to the user. Such a 
measure of relevance is limited, as it does not consider the contextual nature of human 
judgement. Measures of precision and recall should take into account that relevance is 
subjective (Kuhlthau, 1993) and is influenced by the knowledge states and intentions of 
users (Case, 2002). 


Weisberg (1993) argues that everyone has creative traits and creativity is a result of 
ordinary thinking, which is a continuity of the past. Individuals deal with new 
situations based on prior experiences in similar situations. Likewise, Amabile (1990) 
argues that the creative process used by individuals is similar regardless of the 
domains they come from. Using Weisberg’s (1993) argument, an assumption can be 
made that every information seeker is creative and hence engages in creative 
information seeking. 

Establishing a relationship between “creativity” and “information seeking” is 
important as it provides a different perspective of information seeking leading to new 
and perhaps improved ways of developing systems and interfaces for supporting 
information seeking. Such a perspective sees a creative process as inherent in 
information seeking and may, therefore, improve query formulation and refinement, 
searching, browsing, and filtering, thereby, providing better system support to help 
users better understand their information needs. Using the assumption made earlier, 
this paper proposes that examining information seeking from a creative perspective 
may possibly help address some of the limitations of current IR systems. 

In order to establish a relationship between “creativity” and “information seeking”, 
this paper begins with a survey of creativity and information seeking models. Based on 
the survey, stages to creative information seeking are formulated. These stages are 
established through a mapping of a creativity model and an information-seeking 
model. Scenarios of information seeking are then used to clarify these mappings so that 
the model for creative information seeking could be conceptualised. 


Survey of creativity models 

Creativity means different things to different people. One definition associates it 
with the genius. Weisberg (1993) highlights that attempts to understand creativity 
are largely dominated by the “genius” view, which treats creative achievements 
as the results of great individuals using extraordinary processes. However, he argues 
that creativity is a trait everyone has and that novelty results from the use of 
similar thinking processes when a particular person is put in a particular 
environment. He emphasizes that his argument is not trying to claim that great 
individuals who produce original works are the same as ordinary individuals but 
that the thinking processes used are similar. 

Other definitions of creativity associate it with process-oriented or product-oriented 
views. The process-oriented definition highlights creativity as a process that results in 
innovative products (Edmonds and Candy, 2002; Kazanjian et al, 2000). 
Product-oriented views associate creativity with attributes of the outcome. Only when 
an outcome is both novel and valuable can creativity be said to have happened (Akin 
and Akin, 1998). 

Creativity can also be defined using a number of models. Models described briefly in 
this paper include the systems’ view of creativity, the componential model of a creative 
process, and the holistic model of creativity. 


Systems’ view of creativity 

The systems’ view of creativity argues that creativity is a result of social systems 
making judgements of the individual (Csikszentmihalyi, 1990). Here, creativity is a 
result of an interaction between three subsystems: 
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Figure 1 


Roles and relationships of 


each subsystem 





(D a domain; 
(2) a field; and 
(3) a person. 


The roles and relationships of each subsystem are shown in Figure 1. In this model, no 
creative product exists without an input from each of the subsystems.. 

-Figure 1 also shows that a typical cycle of the model starts with the domain, which 
transmits information to the individual. The individual receives the information and 
transforms it. The field then makes a judgement of whether the transformation is 
valuable to society. If it is valuable, it would be included in the domain of knowledge, 
hence providing a new starting point for another cycle of transformation and 
evaluation (Csikszentmihalyi, 1990; Saunders and Gero, 2002). 
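
The cycle described above can be sketched in code. This is only a minimal illustration
under stated assumptions, not part of the systems' view itself: the names `Domain`,
`run_cycle`, `transform`, and `is_valuable` are hypothetical, and the toy value test
(the field accepts anything not already in the domain) is assumed purely to keep the
example runnable.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Domain:
    """Hypothetical stand-in for the domain: the knowledge accepted so far."""
    knowledge: List[str] = field(default_factory=list)


def run_cycle(
    domain: Domain,
    transform: Callable[[List[str]], str],
    is_valuable: Callable[[str, List[str]], bool],
) -> bool:
    """Run one domain -> person -> field cycle; return True if the domain grew."""
    inherited = list(domain.knowledge)            # the domain transmits information
    candidate = transform(inherited)              # the person transforms it
    accepted = is_valuable(candidate, inherited)  # the field judges the result
    if accepted:
        domain.knowledge.append(candidate)        # starting point for the next cycle
    return accepted


# Toy usage (assumed data): the person extends the latest item, and the
# field accepts anything that is not already in the domain.
domain = Domain(["idea"])
extend = lambda known: known[-1] + "'"
novel = lambda candidate, known: candidate not in known
results = [run_cycle(domain, extend, novel) for _ in range(2)]
```

Each accepted transformation is appended to the domain, so the second cycle starts
from the output of the first. A transformation that the field rejects leaves the
domain unchanged.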


Componential model of a creative process 

Amabile’s (1990) componential model of a creative process focuses on stages a creative 
person undergoes to produce a creative product and elements in the environment that 
influence each stage. Stages of the creative process in this model are task 
representation, preparation, response generation, response validation, and outcome 
evaluation. Elements in the environment that influence these stages include: task 
motivation, domain-relevant skills, and creativity-relevant skills. Figure 2 shows the
flow of these stages and how elements in the environment influence the creative
process.

Amabile (1990) reveals that the outcome of one cycle of the creative process has a
direct influence on all three environmental elements in future engagements of similar 
tasks. Hence, a feedback loop is established. Moreover, the application of this model is 
dependent on the complexity of the task. This means that for a complex task, several 
long loops through the stages may be necessary before a successful product is 
achieved. 
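
The looping behaviour can be sketched as follows. This is a simplified illustration
rather than Amabile's model in full: the rule that a partially fulfilled outcome sends
the process back to the first stage is read from Figure 2, while the `evaluate`
callback, the `max_loops` cap, and the function name are assumptions. The feedback
onto the three environmental elements is not modelled.

```python
from enum import Enum
from typing import Callable, List, Tuple


class Outcome(Enum):
    FULFILLED = "fulfilled"
    FAILED = "failed"
    PARTIAL = "partially fulfilled"


# Stage names as listed in the text.
STAGES = [
    "task representation",
    "preparation",
    "response generation",
    "response validation",
    "outcome evaluation",
]


def run_creative_process(
    evaluate: Callable[[int], Outcome], max_loops: int = 10
) -> Tuple[Outcome, List[str]]:
    """Walk the stages repeatedly until the outcome is no longer partial.

    `evaluate` is a hypothetical callback that receives the loop number
    (starting at 1) and plays the role of the outcome-evaluation stage.
    Returns the final outcome and the trace of stages visited.
    """
    trace: List[str] = []
    outcome = Outcome.PARTIAL
    for loop in range(1, max_loops + 1):
        trace.extend(STAGES)
        outcome = evaluate(loop)
        if outcome is not Outcome.PARTIAL:
            break  # fulfilled or failed: the process ends
    return outcome, trace


# Assumed toy task: partially fulfilled twice, fulfilled on the third loop.
outcome, trace = run_creative_process(
    lambda loop: Outcome.FULFILLED if loop >= 3 else Outcome.PARTIAL
)
```

In this toy run the five stages are visited three times, which mirrors the remark
that a complex task may need several loops before a successful product is achieved.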


Figure 1 panels:

Domain: preserves desirable performances selected by the field based on rules
representing thoughts and actions specified in the domain.

Person: the creative person’s function is to produce creative products. He/she tries
to convince the field that the product created is worthy for the domain.
Characteristics of a worthy product are determined by the other two subsystems.

Field: consists of people who act as gatekeepers. These people judge if a product
created meets the criteria to be included in the domain.

[Figure 2 content: stages of the creative process and the environmental elements that
influence them]

Stages
(1) Task representation: ... intrinsic motivation triggers the creative process.
(2) Preparation: individual gathers information for actual generation of the
responses/products.
(3) Response generation: individual searches through all possible pathways to generate
responses/products.
(4) Response validation: domain-relevant skills are used to validate value, correctness
and novelty of a product.
(5) Outcome evaluation: evaluate outcome based on validation done in stage 4. If outcome
fulfilled, process ends. If outcome failed, process ends. If outcome partially
fulfilled, process returns to 1.

Environmental elements
Task motivation: extrinsic and/or intrinsic motivation triggers the creative process.
Domain-relevant skills: skills determine the approaches an individual takes and the
criteria he/she uses to assess the new response generated.
Creativity-relevant skills: variables of personality and individual differences
facilitate/hinder the creative process across all domains.


Holistic model of creativity 


Rhodes (1961) attempted to find one unifying definition of creativity but discovered 
that creativity is subjective and difficult to define. When analysed, the content of
the definitions forms four fundamental areas of inquiry:

(1) a creative person; 

(2) process; 

(3) product; and 

(4) environment. 


Similarly, Mooney (1963) and Weisberg (1993) highlighted that a description of 
creativity should encompass these elements. Such a description is termed here as the 
holistic model of creativity. This model will be used to establish a relationship between 
“creativity” and “information seeking” as it recognises creativity from all aspects 
encompassing the person, process, product, and environment. 

Synthesizing the different views of creativity (Mooney, 1963; Rhodes, 1961;
Weisberg, 1993), this paper identifies attributes of a holistic model of creativity
focusing on four fundamental aspects of a creative person, process, product, and
environment.

Figure 2. The componential model of a creative process

Figure 3. The creative process


The creative person 

The creative person in this model should have imagination, independence, and
divergent thinking (Diakidoy and Kanari, 1999). The ability to see interrelationships is 
a key feature and helps a person break away from conventional ideas when one faces a 
dead end. King and Pope (1999) also associate the creative person with psychological 
richness, complexity, and openness to experience. 


The creative process 

In order to produce a creative product, the creative individual goes through four stages: 
preparation; incubation; illumination; and verification (Gabora, 2002). The flow and 
description of these stages are shown in Figure 3. 


The creative product 

The creative product should satisfy two properties: novelty and value (Weisberg, 1993; 
Akin and Akin, 1998; Fischer and Nakakoji, 1997). Novelty is determined by 
comparing the new product with existing ones (Akin and Akin, 1998; Fischer and 
Nakakoji, 1997). In other words, a novel product is one that is different from all 
previously created products for similar purposes. Value is concerned with the 
relevance of the product to human purposes (Akin and Akin, 1998). Amabile (1990) also 
explains that judgement of a creative product depends on a group of appropriate 
observers. 


The creative environment 
Attributes of a creative environment include boldness, courage, freedom, spontaneity, 
clarity, and self-acceptance (Maslow, 1959). The environment should also support 
collaboration since idea generation does not occur in isolation but rather grows out of 
relationships between individuals (Drazin et al, 1999; MacCrimmon and Wagner, 1994). 


Survey of information seeking models 
In the previous section, different definitions and models of creativity were surveyed. In
this section, three established information seeking models are discussed to provide a
rationale to underpin possible mappings between “creativity” and “information
seeking”. These information seeking models are Wilson’s (1981) model of information
seeking behaviour, Kuhlthau’s (1993) model of the information search process, and the
behavioural model of information seeking (Ellis, 1989; Ellis et al, 1993).


Model of information seeking behaviour 
Wilson's (1981) model is a macro-model of information seeking. It focuses specifically 
on motivation factors that trigger information seeking behaviour, elements in the 
environment that affect these motivation factors, and barriers to information seeking 
behaviour. 

In this model, information seeking behaviour/activity is triggered by three 
motivation factors: 


(1) physiological needs (that is, survival needs for food, water, and shelter); 


(2) affective/psychological/emotional needs such as the need for attainment and 
domination; and 


(3) cognitive needs such as the need to learn a new skill. 


These motivation factors are also influenced by the social role of an individual. This 
social role refers to the role an individual performs in life or work. The model also goes 
on to highlight that the three motivation factors do not necessarily trigger information 
seeking behaviour, as other “external” factors may hinder it. The model does not
explicitly state what these “external” factors are but broadly mentions them as
personal, interpersonal, and environmental barriers to information seeking.

Essentially, this model shows that information needs and information seeking
behaviour are a function of needs, social role, and the environment. Thus, information
needs and information seeking behaviour are subjective as everybody’s roles and 
environments are influenced by different factors. 


Model of the information search process 

In contrast to Wilson’s (1981) model, Kuhlthau’s (1993) model focuses on the feelings,
thoughts, and actions associated with different stages in the “information search 
process”. These stages are initiation, selection, exploration, formulation, collection, and 
presentation. 

The model highlights that at the start (initiation), thoughts are vague and confused. 
As the information seeker goes through the stages from initiation to presentation, 
thoughts become clearer as he/she gains a personal understanding of the problem. At 
the same time, his/her feelings begin to evolve from anxiety and confusion to increased 
confidence and interest as he/she progresses through stages of the information search 
process. A detailed description of the feelings, thoughts, and actions associated with 
each stage in this model’s information search process is shown in Figure 4. 


Behavioural model of information seeking 
The third information seeking model examined to help identify possible
mappings between “creativity” models and “information seeking” models is the
behavioural model of information seeking (Ellis, 1989; Ellis et al, 1993).

Figure 4. Model of the information search process

The behavioural model of information seeking focuses specifically on the generic
categories of activities in information seeking. It takes into account the subjective
sequence of such activities, which varies according to the context and circumstances of
the seeker at that point of time. This model is used to establish a relationship between
“creativity” and “information seeking”.

In this model, the information seeker undergoes eight generic activities.


(1) The first activity is starting. This refers to activities characteristic of the initial
search for information on a new topic/area. The seeker may have some or no
familiarity with the topic/area. Examples of these activities include asking
colleagues, consulting literature reviews, and referring to starter reference
catalogues, abstracts, and indexes.

(2) The second activity is chaining. This refers to following chains of citation
connections between materials, which can be forward chaining or backward
chaining. Forward chaining involves the use of citation indexes or bibliographic
tools to identify citations to relevant materials. Backward chaining refers to
following up references cited in materials consulted.

(3) The third activity, browsing, involves conducting semi-structured or
semi-directed searching in an area of potential interest. Activities in this
stage include browsing tables of contents in journals, checking sources available
in the library, and browsing along shelves.

(4) The fourth activity is differentiating. This activity is concerned with using
characteristics/differences in information sources to filter the amount of
information obtained. Some characteristics for filtering include topic of study,
credibility, author, type of source, and language.

(5) Monitoring, the fifth activity, is about continuously maintaining awareness of
developments in a field through regular monitoring of particular sources.
Methods that information seekers use include the use of informal contacts,
monitoring services, research directories, and journals and publishers’
catalogues.

(6) The sixth activity, extracting, is concerned with systematically working through
a particular source to selectively locate materials of interest. This activity
requires the seeker to set aside a substantial amount of time to go through
sources, like journals, books, computer databases, and indexes to locate
information of interest. This activity is the most directed among all information
seeking activities.

(7) The seventh and eighth activities are verifying and ending, respectively.
Verifying involves checking the accuracy of information. Ending includes
activities characteristic of information seeking found at the end of a
topic/project, for example, returning to the literature during preparation of
papers so that the accomplished work can be discussed in light of related works.


Table I summarises activities in this model. 
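
Of the eight activities, chaining lends itself most readily to a small worked example.
The sketch below illustrates forward and backward chaining over a toy citation map.
The data in `CITES` and the function names are hypothetical and are not drawn from
Ellis's studies.

```python
from typing import Dict, List, Set

# Hypothetical toy data: each paper maps to the papers it cites.
CITES: Dict[str, List[str]] = {
    "C": ["A", "B"],
    "D": ["C"],
    "E": ["B", "C"],
}


def backward_chain(paper: str, cites: Dict[str, List[str]]) -> Set[str]:
    """Backward chaining: follow up the references cited in `paper`."""
    return set(cites.get(paper, []))


def forward_chain(paper: str, cites: Dict[str, List[str]]) -> Set[str]:
    """Forward chaining: find the papers that cite `paper`,
    as a citation index would."""
    return {source for source, refs in cites.items() if paper in refs}


older = backward_chain("C", CITES)   # materials that C builds on
newer = forward_chain("C", CITES)    # materials that build on C
```

Starting from paper C, backward chaining leads to the earlier materials A and B, while
forward chaining leads to the later materials D and E.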


Stages for creative information seeking 
In this section, mappings between the holistic model of creativity and the behavioural 
model of information seeking (Ellis, 1989; Ellis et al., 1993) are proposed. The holistic 
model of creativity and the behavioural model of information seeking were selected to 
establish mappings as they seemed to investigate, in detail, the creative process and 
the information seeking process, respectively. 


Establishing mappings between “creativity” and “information seeking” 

The survey of creativity and information seeking models highlighted several
similarities between the “creative process” in the holistic model and the behavioural
model of information seeking. Hence, a detailed walkthrough was conducted. The
walkthrough involved systematically analysing and comparing each stage in the
“creative process” with “activities” in the behavioural model of information seeking for
similarities. When similarities between stages and activities were found, a link was
established.

Table I. The behavioural model of information seeking

Eight generic activities   Elaboration
Starting                   Refers to activities characteristic of the initial search
                           for information on a new topic
Chaining                   Refers to following chains of citation connections
                           between materials
Browsing                   Conducting semi-structured or semi-directed
                           searching in an area of potential interest
Differentiating            Concerned with using characteristics/differences in
                           information sources to filter the amount of
                           information obtained
Monitoring                 Maintaining awareness of developments in a field
                           through regular monitoring of particular sources
Extracting                 Concerned with systematically working through a
                           particular source to selectively locate materials of
                           interest
Verifying                  Checking the accuracy of information
Ending                     Activities characteristic of information seeking
                           found at the end of a topic or project

The four common links established and the rationale behind each link are
described below:


(1) Preparation in the “creative process” (Figure 3) is concerned with collecting all
necessary information to facilitate creativity. Thus, this stage is linked to
starting, chaining, and browsing in “information seeking” (Table I, rows 2-4) as
these behaviours are also concerned with collecting information. The rationale
behind this link is based on the fact that objectives of preparation, starting,
chaining, and browsing are concerned with gathering information. The only
difference is their contexts. For example, preparation is concerned with
gathering information to facilitate creation while starting, chaining and
browsing are concerned with gathering information through means of
searching, following chains of references and browsing, respectively, to
address an information need.

(2) Incubation in the “creative process” (Figure 3) involves unconsciously solving
the problem. Here, a creative person is unconsciously making linkages among
information and filtering occurs to facilitate the linking process. Hence,
incubation is mapped to differentiating in “information seeking” (Table I, row
5). The rationale for this link is that while an information seeker filters
information gathered, he/she may be subconsciously making linkages among
the filtered information and is working towards solving the information need.
Hence, there should be a link between incubation and differentiating.

(3) Illumination happens when an idea strikes (Figure 3). The possible information
seeking behaviours that facilitate this process include monitoring developments
and extracting relevant information (Table I, rows 6-7) from sources. The
rationale behind this link is that monitoring developments in a field helps an
information seeker address the problem or create a product by providing
background information about the latest developments in a field. This ensures
that he/she addresses the problem uniquely or creates a unique product.
Extracting is also linked to illumination because it is during extracting that the
information seeker is systematically working on the information problem so that
it can be addressed.

(4) Verification in the “creative process” (Figure 3) involves checking the accuracy
of information and working the idea into a communicable form. Thus,
verification in the creative process is linked to verification and ending activities
in information seeking (Table I, rows 8-9). The rationale for this link is due to
similarities between verification in the creative process and ending in
information seeking. Verification is similar to ending because when the
information need is addressed, the information seeker may likely use the
information gathered to solve a problem or create a product. Hence, the process
of solving, creating, and gaining the new understanding may be documented in
some communicable form. Moreover, verification in the creative process is also
linked to verification in information seeking because checking the accuracy of
information and ideas may be necessary before the idea is worked into a
communicable form.





Common links identified between the “creative process” and the “information seeking 
activities” provide a basis for identifying stages in creative information seeking. Hence 
hereinafter, these links are renamed to describe “stages” in a creative process for 
information seeking. 
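
The four links can be written down as a simple lookup structure. The sketch below is
only an illustrative encoding of the links described above. The names
`ELLIS_ACTIVITIES`, `INITIAL_LINKS`, and `linked_stage` are hypothetical.

```python
from typing import Dict, List

# The eight generic activities, in the order given in Table I.
ELLIS_ACTIVITIES: List[str] = [
    "starting", "chaining", "browsing", "differentiating",
    "monitoring", "extracting", "verifying", "ending",
]

# Initial links: creative-process stage -> information seeking activities.
INITIAL_LINKS: Dict[str, List[str]] = {
    "preparation": ["starting", "chaining", "browsing"],
    "incubation": ["differentiating"],
    "illumination": ["monitoring", "extracting"],
    "verification": ["verifying", "ending"],
}


def linked_stage(activity: str) -> str:
    """Return the creative-process stage that an activity is linked to."""
    for stage, activities in INITIAL_LINKS.items():
        if activity in activities:
            return stage
    raise KeyError(activity)


# Flatten the links to check that each activity is linked exactly once.
covered = [a for activities in INITIAL_LINKS.values() for a in activities]
```

Flattening the links reproduces the eight activities in order, so every activity is
attached to exactly one stage of the creative process at this point.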


Using scenarios to refine stages in creative information seeking 

A further examination of the stages established previously highlighted that many 
information seeking behaviours were grouped to each stage. Hence, unique 
characteristics of the different information seeking activities are not highlighted 
explicitly. For example, in stage 1, preparation in the creative process was linked with 
starting, chaining, and browsing activities in information seeking. Preparation, 
starting, chaining, and browsing are grouped together because all are concerned with 
gathering information. However, by grouping them in this manner, unique ways in
which information is gathered in starting, chaining, and browsing are not highlighted
clearly. Thus, the stages were refined through the use of two information-seeking
scenarios. These scenarios helped better relate stages in creative information seeking
to the context of information seeking behaviour, thus facilitating refinement of these
stages. The two scenarios (denoted as scenarios A and B) were selected from Theng 
(2002) and Blandford et al. (2001), respectively, which reported real world occurrences 
of information seeking behaviour. Fictitious names have been given to the information 
seeker in each scenario to give the scenario more character. 

Scenario A: directed search task. This scenario depicts the information seeking 
behaviour of Peter, an experienced web user, who was trying to accomplish a directed 
search task. To complete the task, Peter needs to find an article by Ben Shneiderman in 
Networked Computer Science Technical Report Library (NCSTRL) (Theng, 2002). 

In order to find the article by Ben Shneiderman, Peter typed in “Shneiderman” in the 
“search all bibliographic fields” and selected “sort by author”. The system did not 
return any results because of server problems so he reloaded the page and tried again. 
This time many results were returned but he could not find the article he wanted so he 
clicked on “Ben Shneiderman” hoping to be brought to a page with a listing of all Ben’s 
articles. However, an error occurred which Peter thought was due to server problems. 
He then clicked on “search collection” in the navigation bar and typed the title to 
execute the search again. This time he was successful in finding the article. 

Scenario B: open-ended, complex search task. This scenario depicts the information
seeking behaviour of a first year PhD student, Robin. In this scenario, Robin is trying to
complete an open-ended, complex search task. In order to complete the task, Robin needs
to obtain at least one paper on his research topic that could help with his literature review
using his choice of digital libraries from a given set (Blandford et al., 2001).

Robin was interested in looking for materials on growing cell structures (GCS), text 
classification and self organising maps (SOM). He started by accessing Emerald 
library, but switched to Ingenta, when his first few queries returned no results. Results 
from Ingenta were promising but overwhelming. Robin saved several abstracts and 
printed one full text article from Ingenta. Robin went back to Emerald library as he felt
he might have made some mistakes previously. Previously, he searched Emerald
library using GCS as the keyword. This time he tried searching using GCS in full text
and found something. He conducted more searches using Emerald, Ingenta, and NZDL
(New Zealand Digital Library), usually getting no matches or too many to cope with. At
the end of his search session, he found some relevant materials on text classification
but none on GCS and SOM.


Evaluating stages in creative information seeking 

With the help of the two information seeking scenarios, similarities and differences
between stages in the creative process and activities in information seeking become 
more evident. When there were similarities between stages in the creative process and 
activities in information seeking, a stage in creative information seeking was 
established. However, when no obvious similarities were found between the creative 
process and activities in information seeking, concepts were kept separate to form 
unique stages in creative information seeking. 

In other words, in stage 1, preparation in the creative process may only be mapped 
to starting in information seeking. This is because based on the information seeking 
scenarios, it seemed that preparation and starting were both concerned with initiating 
the creative and information seeking process, respectively. On the other hand, chaining 
and browsing in information seeking should be kept separate, as they were not 
specifically related to preparation in the creative process. Based on the information 
seeking scenarios, chaining and browsing seemed to be concerned with gathering 
information through different means. 

Evidence provided by the information seeking scenarios also seemed to suggest the 
presence of stages in creative information seeking. Hence, mappings between the 
creative process and the information seeking activities were refined to establish the six 
stages in creative information seeking. Terminologies for these stages are from the 
creativity and information seeking models used, hence showing that these stages are 
grounded in established creativity and information seeking models. 

The following paragraphs illustrate the established six stages for creative 
information seeking and rationale for each stage which are grounded in the new set of 
mappings between the creative process and information seeking activities. The new set 
of mappings was refined and supported by extracts of the two information seeking 
scenarios presented previously. Extracts of the scenarios are presented in text boxes. 

Stage 1 - preparation for starting information seeking. This stage was created 
based on the refined mapping between preparation in the creative process and starting 
in information seeking. This new mapping was refined with the help of the following 
extracts of scenarios A and B: 


Scenario A: In order to find the article by Ben Shneiderman, Peter typed in “Shneiderman” in 
the “search all bibliographic fields” and selected “sort by author”... 


Scenario B: Robin was interested in looking for materials on Growing Cell Structures (GCS), 
text classification and Self Organising Maps (SOM)... 


The above extracts show that Peter and Robin are interested in completing their task 
effectively. In Peter’s case, it is finding the specified article while in Robin’s case it is 
doing some research on GCS, text classification, and SOM. However, both lacked the 
knowledge to complete their respective tasks which triggered their information 
seeking process. Hence preparation in the creative process, which is concerned with 
attempting to solve the information need, should be linked to starting in information 
seeking. 


The above extract and refined mapping became the basis for stage 1 (preparation 
for starting information seeking). During this stage, information seekers recognise a 
knowledge gap which triggers creative information seeking. This gap needs to be 
addressed so that the goal/task can be accomplished. 

Stage 2 - chaining information sources. Rationale for this stage was informed by
the following extract of scenario B: 


... Robin saved several abstracts and printed one full text article from Ingenta ... 


Based on the above extract, it can be inferred that as Robin read the relevant abstracts 
and article, he may find other relevant references cited that might be useful. Hence, he 
undergoes chaining to get to these references. The extract also seemed to highlight that 
chaining is a concept unique to information seeking as it helps the information seeker 
find related, relevant materials by chaining references in sources. 

Since the creative process did not seem to highlight any stages that have 
characteristics similar to chaining, chaining in information seeking may not be mapped 
to any stage in the creative process. However, the extract gave an impression that 
chaining is an activity that helped the information seeker gather related information to 
create a product or solve a problem. Hence, chaining should be included as a stage in 
creative information seeking which led to the creation of stage 2 (chaining of 
information sources). In this stage, information seekers aim to find/track related, 
relevant materials to understand the breadth of a topic. 

Stage 3 — browsing and searching. This stage was created based on extracts from 
scenarios A and B. The extracts are presented below: 


Scenario A: ... to find an article by Ben Shneiderman, Peter typed in “Shneiderman” in the 
“search all bibliographic fields” and selected “sort by author”... 


Scenario B: ... He started by accessing Emerald library, but switched to Ingenta, when his
first few queries returned no results ... Robin went back to Emerald library as he felt he 
might have made some mistakes previously ... He conducted more searches using Emerald 
library, Ingenta, and NZDL (New Zealand Digital Library), usually getting no matches or too 
many to cope with... 


In the above extract of scenario A, Peter did a focused search as he had a clear 
understanding of the topic and task at hand. On the other hand, the extract of scenario 
B highlights that in order for Robin to accomplish an open-ended, complex search task, 
he may first need to understand the breadth of the topic through chaining or some 
general browsing and searching before doing a focused search/browse on the topic. 
The extracts also seemed to indicate that although browsing and searching cannot be 
directly linked to stages in the creative process, browsing and searching are important 
activities as they help the information seeker gather required information so that the 
task can be accomplished. 

Hence, the extracts provided an indication that a browsing and searching stage for 
creative information seeking is needed. This led to the creation of stage 3 (browsing 
and searching). This stage may occur after information seekers have understood the 
breadth of a topic. In this stage, information seekers select a focused topic to begin 
searching and browsing information on that topic. 

Stage 4 — incubation for differentiating purposes. This stage was created based on 
the refined mapping between incubation in the creative process and differentiating in
information seeking. The following extracts of scenarios A and B were used to refine
the mapping:


Scenario A: ... This time many results were returned but he could not find the article he 
wanted so he clicked on “Ben Shneiderman” hoping to be brought to a page with a listing of 
all Ben’s articles ... 


Scenario B: ... Results from Ingenta were promising but overwhelming ... 


Extracts from both scenarios seemed to indicate that Peter and Robin used their 
personal filtering criteria to help them cope with the overwhelming results returned by 
the system and also to help them determine which sources in the results’ list were 
relevant. The extracts seemed to provide insights that incubation in the creative 
process is linked to differentiation in information seeking. This is because incubation 
involves unconsciously solving the problem and filtering may occur to help the 
information seeker establish linkages towards solving the problem. 

Hence, based on evidence suggested by the extracts and the refined mapping, stage 
4 (incubation for differentiating purposes) was proposed. In this stage, information 
seekers are unconsciously filtering information and establishing linkages to help them 
form the “big picture”. This can be a tedious task as information seekers can become 
overwhelmed by the amount of information gathered in the previous stage. 

Stage 5 — monitoring and extracting for illumination. The following extracts of 
scenarios A and B led to the refined mapping of monitoring and extracting in 
information seeking to illumination in the creative process: 


Scenario A: ... He next clicked on “search collection” in the navigation bar and typed the title 
to enter the search again. This time he was successful in finding the article. 


Scenario B: ... At the end of his search session, he found some relevant materials on text 
classification but none on GCS and SOM. 


Based on the above extract of scenario A, it can be inferred that in order for Peter to 
know if the source in the results’ list is the one he really needs, Peter needs to explicitly
access the source and look at the document to extract information to determine if it is 
indeed the correct one. As for the above extract of scenario B, it can be inferred that in 
order for Robin to know how to complete his literature review (that is, illumination), he 
needs to go through relevant materials found to extract parts that are useful for him. 
Moreover, since a literature review is concerned with showing the progression of 
developments in certain concepts, Robin may also need to do some monitoring of 
concepts highlighted in materials that he has extracted so that his literature review 
becomes more comprehensive. 

The above extracts provided insights that illumination in the creative process 
should be linked to monitoring and extracting in information seeking. This is because 
monitoring and extracting may facilitate illumination by providing relevant 
information while monitoring may also facilitate illumination by ensuring that the 
problem is addressed uniquely and ideas created are exclusive. 

Hence, the refined mapping became the basis for the creation of stage 5 (monitoring
and extracting for illumination). In this stage, information seekers monitor
developments in a field and go through sources to pull out relevant information so
that they can achieve a personal understanding of the topic and an idea can be
produced.

Stage 6 - verification of information sources. The rationale for this stage is based on
the refined mapping between verification and ending in information seeking and 
verification in the creative process. The following extracts of scenarios A and B were 
used to inform the refined mapping: 


Scenario A: . . . He next clicked on “search collection” in the navigation bar and typed the title 
to enter the search again. This time he was successful and found the article ... 


Scenario B: ... At the end of his search session, he found some relevant materials on text 
classification but none on GCS and SOM... 


Based on the above extract of scenario A, it can be inferred that in order for Peter to 
determine if the source is the one he needs, Peter needs to access the actual source to 
extract information about the source. He then needs to verify the extracted information 
with information provided about the specified source. As for the extract of scenario B, 
it shows that Robin had gathered some information on his research topic which he can 
use to complete his literature review (that is, working the idea into a communicable 
form). Moreover based on the extract, it can be inferred that Robin may conduct more 
searching/browsing to ensure that the information gathered is indeed accurate. 

Hence, the extracts seemed to indicate that verification in the creative process can be 
mapped to verification and ending in information seeking because they are all 
concerned with verifying the information gathered and using the information gathered 
to accomplish the task. 

The refined mapping provided the basis for stage 6 (verification of information 
sources). In this stage, information seekers go through a process to ensure that the idea 
is correct. They then work the idea into a communicable form. 


Discussion and conclusion 

Evidence provided by the two scenarios used to conceptualise the creative information 
seeking model seemed to suggest that the type of information seeking task may have 
an impact on the extent to which the information seeker exhibits all stages in the 
model. In other words, this means that depending on the type of task the extent or way 
in which information seekers exhibit proposed stages in creative information seeking 
may be different. For example, in scenario A where Peter was asked to find a specified 
article, evidence provided by the scenario seemed to show that Peter did not go through 
all stages in creative information seeking to complete the task. However, in scenario B 
where Robin was asked to find at least one article that would help in his literature 
review, Robin seemed to have gone through all stages highlighted in the model to
complete the task. 

In order to evaluate the observation that the type and complexity of the task may 
have an impact on the extent of creative information seeking exhibited, there is a need 
to conduct empirical studies and gather evidence to examine this observation in more 
detail. Hence, Part II aims to verify and evaluate this observation through two studies. 
The objectives of these studies are to lend support to the six stages in the creative 
information-seeking model and to examine the extent of creative information seeking 
exhibited by information seekers while accomplishing the information seeking task. 
One study will address these objectives using a directed information seeking task
while the other study will address similar objectives using an open-ended, complex
information seeking task. 

In this paper, a conceptual framework for six stages in creative information seeking 
has been proposed by synthesizing a holistic model of creativity and an established 
information seeking model. These stages differ from Wilson’s (1981) model of 
information seeking behaviour, Kuhlthau’s (1993) model of the information search 
process, and the behavioural model of information seeking (Ellis, 1989; Ellis et al., 1993). 

The six stages in creative information seeking focus on the creative and cognitive 
processes involved during the information seeking experience while the model of 
information seeking behaviour (Wilson, 1981) focuses on how information needs arise 
as a result of a person’s needs and environment. Kuhlthau’s (1993) model of the
information search process, on the other hand, is concerned with the feelings, thoughts, 
and actions associated with each stage of the process. The behavioural model of 
information seeking (Ellis, 1989; Ellis et al., 1993) focuses on eight generic activities in 
information seeking and recognises that the sequence of these activities is dependent 
on an information seeker’s context and circumstances. 

Proposed stages in a creative process for information seeking need to be further 
refined, tested, and used in real-world situations before they can emerge as stages and 
principles for the design of IR systems to support users’ creative information seeking 
behaviours. In Part II of this paper, methodologies and analyses of two empirical 
studies will be presented and discussed. The aims of these studies are to lend further 
support to the creative information seeking model and to gather evidence to evaluate 
the observation that the extent of creative information seeking exhibited by an 
information seeker is dependent on the task.


Abstract

Purpose – This paper aims to make a substantial contribution to the ongoing debate about the
potential of open access publishing and institutional repositories to reform the scholarly 
communication system. It presents the views of senior authors on these issues and contextualises 
them within the broader framework of their journal publishing behaviour and preferences. 
Design/methodology/approach — A highly representative online opinion survey of more than five 
and a half thousand journal authors, building on an earlier (January 2004) benchmarking study carried
out by CIBER. 

Findings — Senior researchers are rapidly becoming more informed about open access publishing 
and institutional repositories but are still a long way off reaching a consensus on the likelihood that 
these new models will challenge the existing order, nor are they in agreement whether this would be a 
positive or a negative development. Disciplinary culture and, to a less extent, regional location are key 
determinants of author attitudes and any policy response should avoid “one-size-fits-all” solutions. 
Research limitations/implications — This survey reflects the opinions of senior corresponding 
authors who have recently published in a “top” (i.e. ISI-indexed) journal, with 95 per cent confidence.
The findings should not be generalised to represent the views of all authors in all journals, open access
or otherwise.

Originality/value — The journal publishing sector is facing enormous challenges and opportunities 
as content increasingly migrates to the web. The value of this research is that it provides an objective, 
non-partisan, assessment of the attitudes and opinions of more than 5,000 senior researchers, a key 
stakeholder group, and thus contributes both to the development of public policy as well as more 
realistic commercial strategies. 


Keywords Communication, Journal publishers, Publishing, Surveys 
Paper type Research paper 


Aslib Proceedings: New Information Perspectives
© Emerald Group Publishing Limited


Introduction

This paper constitutes the first findings[1] of an international survey of journal authors
which was commissioned by the Publishers Association (PA) and the International
Association of Scientific, Technical and Medical Publishers (STM), with additional
support from CIBER associates, early in 2005. This follows on the success of a previous
study of author attitudes and opinions in regard to scholarly communication, which
CIBER carried out in 2004 (Rowlands et al., 2004a, b; Nicholas et al., 2005a). As with our
earlier study, this survey was conducted independently and we had complete freedom
as to the questions we asked and how we interpreted and reported the data.
The specific objectives of this study were to: 


* accurately chart the growing awareness of open access concepts among the 
author community and to explore their attitudes to new publishing models, 
including open access and institutional repositories; 


* place this in the context of their use and perceptions of the traditional publishing
system; and 


* determine how much had changed since early 2004. 


This was to be undertaken in a manner that would enable us to make valid statements 
about the position in regard to all subject fields and regions. 


Methods 

The survey instruments, an online questionnaire and a covering e-mail invitation to
participate, were designed by CIBER and extensively piloted over three rounds using 
correspondents from a variety of disciplinary backgrounds from Australia, India, 
Mexico, France, Greece, the USA and the UK during May and June 2005. The research 
design was based on closed questions to facilitate its roll out on a very large scale using 
web technologies. The report does, however, make extensive use of some of the 
unprompted free text comments made by authors at the end of the questionnaire to 
underline the variety of opinions and the complexity of some of the issues raised (these 
unprompted comments will be the subject of a separate article). 

The survey was administered on CIBER’s behalf by NOP World. Authors were sent 
an e-mail message inviting them to take part in the survey. The message contained a 
hyperlink to NOP’s database, enabling them to link direct to the online questionnaire. 
The survey went live on 21 June 2005. Reminders were sent out on 1 July and the 
fieldwork concluded on 6 July when we had exceeded our target number of completed 
responses. 

Author mailing lists were commissioned from the Institute for Scientific 
Information (ISI) to CIBER’s specification: 100,000 randomly selected authors who 
had published in an ISI-indexed journal during 2004. The lists were supplied direct to
NOP who subsequently removed any duplicate names. NOP has a number of clients in 
the publishing sector who also engage in web-based survey research. To avoid 
over-exposure of these methods, NOP removed from our original lists the names of any 
authors who had recently been invited to take part in such a survey. After the removal 
of duplicate names and the removal of authors who had recently been contacted by 
NOP for research purposes, 76,790 e-mail invitations were sent out and 5,513 were 
returned. The response rate (7.2 per cent) was high for a web-based industrial survey, 
most of which tend to cluster in the 4-6 per cent band. However, the response rate raises
the question, common to all surveys, of whether there are any systematic differences
between the invited and responding populations.

The completed questionnaires were generally representative of the broad regional 
categories that ISI uses (Rowlands and Nicholas, 2005). In terms of regional response, 
Asian authors were under-represented, by 9 percentage points, possibly due to the fact 
that, for reasons of cost, the survey was administered in the English language. It may 
well be that the Asian response rate was affected both by language and character set
issues. It is worth bearing in mind, however, that the absolute response still
reflects the views of 1,280 individual Asian authors. The final sample offered a very
accurate reflection of authors’ subject interests (Rowlands and Nicholas, 2005). We
conclude that the final survey sample is highly representative by subject discipline but
that there is a small but significant shortfall in Asian authors. For this reason, we are
conservatively basing our estimate of the power of the survey on the actual response
rate obtained for Asian authors. On that basis, we estimate that the findings reported
in this report should be interpreted within the context of a 95 per cent confidence limit
and within a confidence interval of ± 2.7 per cent.
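
The two headline figures in this section can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes simple random sampling, the conventional worst-case proportion of 0.5 and a z-value of 1.96 for 95 per cent confidence. The function names are hypothetical, and the text does not state that this is how the estimate was actually derived.

```python
import math


def response_rate(returned, invited):
    """Share of invitations that produced a completed questionnaire."""
    return returned / invited


def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a normal-approximation confidence interval for a proportion.

    p=0.5 is the worst case (widest interval); z=1.96 corresponds to 95 per cent.
    Both defaults are assumptions made for this illustration.
    """
    return z * math.sqrt(p * (1 - p) / n)


# Figures quoted in the text: 76,790 invitations, 5,513 returns, 1,280 Asian authors.
rate = response_rate(5513, 76790)
moe_asia = margin_of_error(1280)

print(f"response rate: {rate:.1%}")
print(f"margin of error (n = 1,280): {moe_asia:.1%}")
```

Under these assumptions the 1,280 Asian responses give a margin of roughly 2.7 percentage points, which is in line with the interval quoted above.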


Results 
Authors were first asked some broad and contextual questions to determine their 
views and behaviour in regard to journal scholarly communication; specifically: 

* what factors were involved in choosing where they published;

* what they thought of the peer review process;

* their views on the key topics of information overload, journal pricing, journal
specialisation, journal metrics; and

* how they found journal articles of interest.


This was done in order that open access and institutional repositories, topics too often 
treated in isolation, could be seen and evaluated in the light of the much bigger picture 
of journal publishing. We then move on to a detailed consideration of issues and
attitudes to do with open access publishing and institutional repositories. 


Deciding where to publish 

In answering this question, authors were asked to relate their responses to a recent
critical incident: the last article they had published. In regard to choosing where to
publish, various factors were rated on a scale where 5 = very important, 1 = not at all
important (Figure 1). The responses revealed the central importance of the prestige of
the outlet, as indicated directly by the journal’s reputation (mean response = 4.50) or,
by proxy, by its impact factor (4.04). Practical issues such as the nature of the
readership for that journal (4.21) and speed of publication (3.89) were also relatively
highly rated. It seems from these results that the population as a whole does not attach
much importance to the issues of being able to retain their copyright in the article, nor
to gaining permission to place a pre- or post-print on the web or in some kind of
repository.

Figure 1. Averages, where 5 = Very important, 1 = Not at all important. Factors rated: Reputation of the journal; Readership; Impact factor; Speed of publication; Reputation of editorial board; Online manuscript submission; Print and electronic versions; Permission to post post-print; Permission to post pre-print; Permission to retain copyright. Note: n = 5,513


Peer review process 

Authors were again asked to relate their responses to a critical incident: the last article 
they had published. On the whole, authors’ experience of the peer review process was 
highly positive, with 77.0 per cent of respondents agreeing that the referees’ comments 
on their last published paper had been helpful and had improved their work (Figure 2). 
This sets a useful benchmark for future surveys. There were, however, major
variations within this overall picture: interestingly, authors in physics and astronomy,
a community which works within a very open, collaborative, information culture and
was the first to embrace pre-print servers in a big way, were the least positive about their
experience of the formal system.

The scholarly community attaches enormous value to the function of peer review in
regulating the quality of what is published: 96.2 per cent of our respondents indicated
that this aspect is “very” or “quite important”. The unsolicited comments that came at
the end of the survey revealed that, love it or loathe it, effective peer review mechanisms
were fundamental to scholarly communication. However, a reading of the author
comments indicates that all is not well and there is widespread dissatisfaction,
especially regarding the time reviewers take over manuscripts and the evident lack of
care they often take; some authors even doubted reviewers’ qualifications for the task.
Several respondents complained of the power that top journal editors and reviewers
have to make or break careers and many suggested that more open forms of peer
review, perhaps making comments publicly available on the web, would help to
address this issue. The vexed question of whether reviewers should have to forgo
their anonymity, again in the interests of greater transparency, was another common
theme.

Not surprisingly, it was found that authors who have had positive recent experience 
of the peer review system in practice were more likely to agree that it is important in 
principle. More surprising, perhaps, is the finding that positive attitudes toward peer 
review appear to be age-related and to decline gradually towards the latter part of 
authors’ careers. Within the sample, there was significant regional variation in the 
importance authors attached to peer review: North Americans gave this factor the 
highest priority, Asian authors the least. 


Views on current “hot” issues to do with the journals system 

In this section, attention shifts from a previous critical incident (the last paper 
published) to a more general exploration of authors’ views on the current state of 
journal publishing, focussing on some key or contentious areas: information overload, 
journal pricing, journal specialisation, and journal metrics. 

Respondents were offered a range of statements (Figure 3) and asked to rate each on 
a 5-point scale, where 5 = very important, 1 = not at all important. Although they 
were presented in a randomised order, these statements did in fact form matched pairs 
and so they will be discussed in that context. 

Information overload. Figure 4 juxtaposes responses to the statements “Too much 
research is being published” and “I publish more than I ought to” and finds an 
interesting disjunction. One might reasonably expect there to be little difference 
between authors’ views on these two statements (the “null hypothesis”) but in fact 
there is a highly significant statistical difference between the two profiles and only 
8.5 per cent of respondents agreed (“strongly” or “a little”) with both statements. 
Perceptions are key here: most people feel insecure in the face of the rapid growth of the 
literature yet, as successful authors in highly filtered top quality journals, they may 
quite rightly feel that they are not personally to blame for the information explosion. 
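
For readers who want to see how a matched pair of statements can be analysed, the following is a minimal sketch of the general approach: cross-tabulate each respondent's two answers, count those who agree with both, and compute a chi-square statistic for the table. The responses in `toy_pairs` are invented purely for illustration and the function names are hypothetical; the code does not reproduce the survey data or the figures reported here.

```python
from collections import Counter

LEVELS = ["Strongly agree", "Agree a little", "Neither",
          "Disagree a little", "Strongly disagree"]
AGREE = {"Strongly agree", "Agree a little"}


def cross_tabulate(pairs):
    """Rows = answer to statement 1, columns = answer to statement 2."""
    counts = Counter(pairs)
    return [[counts[(a, b)] for b in LEVELS] for a in LEVELS]


def share_agreeing_with_both(pairs):
    """Proportion of respondents who agree (strongly or a little) with both."""
    both = sum(1 for a, b in pairs if a in AGREE and b in AGREE)
    return both / len(pairs)


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom.

    Rows and columns with a zero total are ignored when counting degrees
    of freedom, since they contribute nothing to the statistic.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    rows_used = sum(1 for t in row_totals if t > 0)
    cols_used = sum(1 for t in col_totals if t > 0)
    return stat, (rows_used - 1) * (cols_used - 1)


# Invented responses: (statement 1 answer, statement 2 answer) per respondent.
toy_pairs = (
    [("Strongly agree", "Strongly disagree")] * 6
    + [("Agree a little", "Agree a little")] * 1
    + [("Neither", "Disagree a little")] * 3
)

both_share = share_agreeing_with_both(toy_pairs)
statistic, dof = chi_square(cross_tabulate(toy_pairs))
print(both_share, statistic, dof)
```

A large statistic relative to its degrees of freedom indicates that the two sets of answers are not independent of one another; with real data the p-value would be read from a chi-square table or computed with a statistics library.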


Figure 3. Averages, where 5 = Very important, 1 = Not at all important. Statements rated: I publish for fellow specialists; High prices make access difficult; Downloads are a good research indicator; Citations are a good research indicator; Too much is published; Journals are too specialised; I publish in affordable journals; I publish more than I ought. Note: n = 5,513

Figure 4. Attitudes towards information overload. Response categories: Strongly agree; Agree a little; Neither; Disagree a little; Strongly disagree. Series: “Too much research is being published” and “I publish more than I ought to”. Note: n = 5,513; Chi square = 521.11, d.f. = 25, p < 0.001


Many authors commented that too much low-quality (i.e. non-peer-reviewed or scantily
reviewed) material was being published, especially over the internet. A persistent
theme in author comments is the degrading effect of the “publish or perish” culture 
on academic discourse. Indeed, one researcher suggested that it was this, i.e. the 
“publish or perish” mentality, rather than journal pricing or arguments over publishing 
models, that is the biggest challenge facing the sector during the current journals 
“crisis”. 

Journal pricing. Figure 5 compares authors’ responses to the statements “High 
journal prices make it difficult to access the literature” and “As an author, I deliberately 
publish in journals that are affordable to readers”. Again, there is a strange split of 
opinion here, with only 20.7 per cent of respondents agreeing with both propositions. 
Intriguingly, a small but sizeable minority (12.8 per cent) disagreed strongly that high 
prices were a barrier to accessing the literature. Biomedical sciences were unusually 
highly represented in this group.

This finding exemplifies the much discussed “Jekyll and Hyde” nature of academics 
as authors and academics as readers and suggests that their needs are very different in 
these two contexts. Many authors spoke of the instrumental influence of external 
measures, like impact factors, in determining where they felt they have to publish, 
sometimes to the detriment of their readers. There is also the issue that most academics 
are shielded from the workings of the marketplace (and journal pricing is hardly as 
transparent as it might be) and from any reasonable knowledge of journal prices. 
However, as in the previous question, the lack of integration of author views appears 
odd to say the least (imagine if the scenario were “Do you agree that unprotected sex
may lead to sexually transmitted diseases?” and “Do you use a condom?”).

Figure 5. Response categories: Strongly agree; Agree a little; Neither; Disagree a little; Strongly disagree. Series: “High prices make it difficult to access the journals literature” and “I publish in affordable journals”. Note: n = 5,513


Unsurprisingly, high journal prices are perceived as a much greater problem, in terms 
of being able to access the literature, by younger authors and by those based in Africa 
and Eastern Europe. 

Journal specialisation. Figure 6 shows the pattern of author reaction to the 
statements “Journals have become too specialised” and “As an author, I choose 
journals that will be read by the specialists in my field”. This time, 32.5 per cent of 
authors agreed with both statements, although there was absolutely no consensus with 
respect to the first statement and a massive majority that agreed strongly with the 
second. This pair of findings suggests that researchers were strongly wedded to the 
“narrowcasting” functions of the traditional journal, where considerable value added 
derives from the niche marketing of these products into specialised communities. 

Journal metrics. Finally, we consider the issue of how credible senior researchers 
found two controversial indicators of the usefulness of research: (author) citations and 
(reader) downloads. This time, there is a clear consensus, with 61.5 per cent of 
researchers agreeing that both were useful in this context (Figure 7). Note that the 
question explored the measurement of utility rather than “quality” but this is
nonetheless a surprising finding, given the relative infancy of the download metric,
and it may indicate that download metrics would have considerable credibility
amongst the author community. Alternatives to the traditional impact factor, based on
article downloads and modelled using the same time windows as are used to construct
impact factors, might offer a very interesting and worthwhile direction for future
research and development: they would certainly be of greater appeal to librarians and
many publishers.
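
To make the idea of a download-based analogue concrete, the sketch below applies the familiar two-year window of the journal impact factor (citations received in a given year by items published in the two preceding years, divided by the number of items published in those years) to download counts instead. The function name, the dictionaries and all of the figures are hypothetical and chosen purely for illustration; they do not describe an established metric or any data from the survey.

```python
def two_year_indicator(events_by_pub_year, items_by_pub_year, census_year):
    """Events recorded in census_year for items published in the two preceding
    years, divided by the number of items published in those two years.

    With citation counts this is the classic two-year impact factor
    calculation; with download counts it becomes a download-based analogue.
    """
    window = (census_year - 1, census_year - 2)
    events = sum(events_by_pub_year.get(year, 0) for year in window)
    items = sum(items_by_pub_year.get(year, 0) for year in window)
    if items == 0:
        raise ValueError("no items were published in the two-year window")
    return events / items


# Hypothetical journal: downloads recorded during 2005, keyed by the
# publication year of the article that was downloaded.
downloads_in_2005 = {2004: 12000, 2003: 8000, 2002: 3000}
articles_published = {2004: 50, 2003: 50, 2002: 45}

download_factor = two_year_indicator(downloads_in_2005, articles_published, 2005)
print(download_factor)  # average 2005 downloads per article published in 2003-2004
```

Because the window and the denominator are the same as for the citation-based measure, the two figures could be reported side by side for the same journal and year.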


Discovering journal articles

Figure 8 shows the very considerable reliance that researchers now place on a wide 
variety of electronic tools to identify journal articles that are relevant to their needs: 
clearly they were availing themselves of the huge choice available. However, some 
things do not change and the single most important mechanism was still “chaining” 
from one document to others that may be useful by following up cited references 
(mean = 4.06). This is clearly a strong and entrenched behavioural characteristic and 
one which the publishing industry is supporting by adding value to digital libraries 
through cross-linking. Despite, possibly because of, the march of full-text, abstracting 
services still remain fairly popular (3.51). This finding is further borne out by log
studies we have conducted on the publisher platforms of Emerald and Blackwell
(Nicholas et al., 2005b). 

The convenience and speed of electronic tools contrast with the role of the physical
library (2.37), now ranked in 11th position out of 12. In the light of this finding, clearly,
libraries need to consider their position/visibility in a digital world where their users
are removed from them and not even conscious that libraries are the ones who pay the access
bills. A number of respondents commented on how electronic media had contributed to
big improvements in their ability to identify and use journal information. 


Figure 8. Averages, where 5 = Very dependent, 1 = Not at all dependent. Surviving category label: Other web search engine.


Open access publishing

Knowledge of open access publishing. In this section, we benchmark authors’
self-reported knowledge of open access journals (July 2005) against the results of an
identical question answered by 3,787 researchers in January 2004 (Figure 9). Even
though the data were collected from different populations, the methodology was
identical and the confidence intervals for the two surveys are small enough to conclude
that a significant shift has indeed occurred within the space of 18 months: the research
community is now much more aware of this issue. The biggest rise has been in authors
knowing “quite a lot” about open access (up 10 percentage points from the 2004 figure)
and the biggest fall in authors knowing nothing at all (down 25 points). At the same
time, around a fifth of the author population still claimed to “know nothing at all”.
The regional breakdown on this question reveals substantial variation in terms of the
level of knowledge of open access, with Eastern Europe, Asia and Africa showing
much higher levels of awareness than North America or Western Europe. In addition to
these regional variations, there are significant differences relating to institutional
affiliation and subject discipline.

Figure 9. Knowledge of open access publishing. Percentages. Response categories: Nothing at all; A little; Quite a lot; A lot. Series: 2004 and 2005. Note: 2004 n = 3,787, 2005 n = 5,513

Experience of open access publishing. A surprising finding of this survey was the 
dramatic growth in authors who said they have actually published in an open access 
journal since the last survey (Figure 10). This has grown considerably from 11 per cent
(January 2004) to 29 per cent (July 2005) and represents a major shift in behaviour,
albeit from a low base. Authors who had published in an open access journal are more 
likely to attach a lower value to the importance of peer review. 
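
As a rough check on the size of this shift, the sketch below recomputes the 2005 share from the respondent counts listed in Figure 10 and compares it with the 2004 figure using a pooled two-proportion z-statistic. This is an illustration under stated assumptions rather than the analysis behind the survey: it treats both surveys as simple random samples, and it assumes that the 2004 share of 11 per cent rests on all 3,787 respondents to that survey. The function name is hypothetical.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Pooled z-statistic for the difference between two independent proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p2 - p1) / standard_error


# 2005 counts as listed in Figure 10.
yes_whenever_possible = 509
yes_not_major_issue = 908
no = 3474
dont_know = 62

answered_2005 = yes_whenever_possible + yes_not_major_issue + no + dont_know
share_2005 = (yes_whenever_possible + yes_not_major_issue) / answered_2005

# 2004: 11 per cent, with an assumed base of 3,787 respondents.
share_2004, n_2004 = 0.11, 3787

z = two_proportion_z(share_2004, n_2004, share_2005, answered_2005)
print(f"2005 share: {share_2005:.1%}, z = {z:.1f}")
```

The listed counts sum to 4,953 and give a 2005 share of about 29 per cent. Under the assumptions above the z-statistic is far beyond the conventional 1.96 threshold, which is consistent with describing the change as a major shift.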

The impact of publishing. We provided authors with a working definition of open
access journals: 


Open access journals use a funding model in which researchers are able to read, download, 
copy, distribute and print articles and other materials free of charge from the Internet. Open 
access publishers sometimes meet their costs by charging authors (usually through the 
author’s funding body or employer), for the publishing services they provide. In other cases, 
open access journals are run by researchers themselves and the publishing costs are absorbed 
by their employers. 


We then provided a list of statements intended to elicit their views on the kinds of
outcomes that they might expect in an open access world and asked them to rate them
on our 5-point scale (Figure 11). Authors strongly believed that articles will become
more accessible (mean = 4.10) and somewhat less strongly felt that libraries would
have more to spend (3.52). They do not believe, however, that quality will improve
(2.63). This question was also used in the previous CIBER survey, but using a 4-point
rather than a 5-point scale. The rank orderings of the two surveys are identical, so it
looks as though the findings shown in Figure 11 are consistent and robust.

Figure 10. Numbers of respondents
Yes, I publish OA whenever possible: 509
Yes, but OA is not a major issue for me: 908
No: 3,474
I don’t know: 62
Note: n = 5,513

Figure 11. Averages, where 5 = Strongly agree, 1 = Strongly disagree. Statements rated: Articles will be easier to obtain; Libraries will have more money to spend; Authors will publish more often; Fewer articles will be rejected; Articles will become longer; Archiving will suffer; Authors will have less choice where they publish; The quality of articles will improve. Note: n = 5,513

How disruptive is open access. We also offered the statement: “A major shift to open 
access publishing would undermine the current scholarly journals system” and 
solicited researchers’ views as to how likely they felt this would come to pass. 
This statement clearly polarised views, although there was a clear majority of 
authors believing it would prove undermining. Of those who expressed an opinion, 
49.5 per cent believed this was likely, 28.3 per cent believed it was unlikely and 
22.7 per cent adopted a neutral position.
We then asked a rider: “To what extent do you think this would be a good thing
or a bad thing?” Of those who expressed an opinion, 41.0 per cent said it would be a 
“good thing”, 20.1 per cent a “bad thing”. 


Research funding 

This question was designed to provide much-needed data on the extent to which 
recently published articles are based on work backed by a grant or research contract. 
This aspect of the “author pays” business model has been largely ignored in the 
literature and recent surveys, which is very surprising given its significance. Our 
research shows that only 40.1 per cent of respondents were in receipt of external 
funding for all of their papers over the past three years; almost a third (32.3 per cent)
had published most of their work without such support (i.e. with 50 per cent or fewer, 
or none, of their papers having been funded). 

As might be expected, there was considerable regional variation with respect to this 
question, with African authors being the least likely to have a majority of their papers 
grant-supported. Arts and humanities, mathematics and the social sciences are of
course the disciplines which report the lowest proportion of directly funded papers. 
When we focus on the major science and technology disciplines represented in the 
survey, it becomes clear that there is still enormous variation, with medicine and allied 
health comprising the category with the smallest proportion of funded papers. 
These findings strongly suggest that the “author pays” business model needs to be 
introduced with some caution: it is evidently not a panacea outside of certain niche 
markets where it has been shown to be very successful.


Who should meet publishing costs? 

So we naturally turn to the question of who should meet the costs of publishing 
scholarly articles, the question that is at the heart of the current debate about 
alternative business models (Figure 12).

Well, as far as authors are concerned, the answer appears to be “as far away from
me as possible”! There is little enthusiasm then for author or reader facing charges, and
a feeling that libraries should not have to make such a large contribution to journal
costs as they do at the moment. The favoured option is that a greater burden should be
borne (in this order) by research funders, commercial sponsors and central government.

However, as we have already seen, there are problems with a solution based on 
scaling publication costs to research overheads in that many authors prefer, or have no
alternative but, to publish on their own or their employer’s resources. The issue of
commercial sponsorship is an interesting one, and several respondents made the point 
of the need for greater use of advertising revenues (the problem here being that this 
proposal is unrealistic given the highly specialist nature of many journals) and other 
respondents made the point that commercial sponsorship could pollute academic 
discourse. 


Institutional repositories 

Figure 13 compares authors’ self-rated knowledge of institutional repositories with the 
results of the same question for open access journals reported earlier. Researchers’ 
awareness of this model is currently very limited: only 9.7 per cent declared that 
they had “a little” or “a lot” of knowledge, compared with 30.3 per cent for open 
access journals. As for open access, there is considerable geographic variation with
respect to levels of knowledge of institutional repositories with Asian authors claiming 
to be the most informed, Western European authors the least. 

Authors were asked about their attitudes to populating institutional repositories 
(Figure 14). There appear to be more people willing than unwilling, but a substantial
minority (38.1 per cent) of authors who expressed an opinion declared their
unwillingness to deposit articles in an institutional repository.

Figure 14. Author attitudes to populating institutional repositories

Figure 15. Author attitudes to version control issues

Note: n = 5,513


The inevitable outcome of repositories will be that there will be a number of versions of 
an article in existence. We thus asked researchers to respond to the statement: “Are you 
happy that, under an institutional repository model, readers would be able to retrieve 
several different versions of your articles (for example, the “official” version of your
paper on the publisher’s web site, together with one or more pre-publication versions on 
public web sites)?” The results (Figure 15) reveal no consensus at all on this issue, but 
generally there was more unhappiness than happiness. 

In the same vein, we wished to determine how disruptive authors thought 
institutional repositories would be. We thus asked respondents to respond to the 
statement: “A major shift to archiving published articles in institutional repositories
would undermine the current scholarly journals system”. Although there was no 
consensus on this issue, it appears that researchers perceive repositories (Figure 16) to 
be slightly less of a threat to the traditional journals system than open access journals. 
Authors also appeared to be less positive about institutional repositories than they are 
about open access journals and see them as less of a “good thing” (Figure 17). 


Conclusions

We know a good deal about the views of publishers and librarians in regard to the
workings of the scholarly communication system, although a good deal of it is skewed
by territorial ambitions. However, as far as we are aware this is the largest, most
representative and statistically robust study undertaken into the views of authors on
the workings of the scholarly publishing system. It is also a study conducted in a
totally unbiased fashion; the research team (CIBER) have no allegiances other than to
the data. Taken together with the 2004 study, this study provides policy makers with
an enormous data resource upon which they can build sound policies: policies which
have an enormous impact on the whole research community, not to mention a
nationally important industry (publishing). To date they have largely had to rely on
small, unrepresentative studies bereft of statistical robustness. In this data vacuum
dangerous policies have been developed and are being implemented with gusto.
Now surely is the time to take stock on the back of the
copious evidence that the author community (nearly 10,000 of them) has only too
happily shared with us over a period of two years.
The key evidence to note is: 


* In determining where to publish, the author population as a whole did not attach 
much importance to the issues of being able to retain their copyright in the 
article, nor to gaining permission to place a pre- or post-print on the web or in 
some kind of repository. 


* The crucial importance of peer review to the health and welfare of scholarly 
publishing. 

* Most authors felt insecure in the face of the rapid growth of the literature yet as,
by definition, successful authors in highly filtered “top” quality journals, they felt
that they were not personally to blame for the information explosion.


* The “Jekyll and Hyde” nature of academics as authors and academics as readers 
emerged in many of their responses but no more strongly than when comparing 
authors’ responses to the statements: “High journal prices make it difficult to 
access the literature” and “As an author, I deliberately publish in journals that 
are affordable to readers”. There was a strange split of opinion here, with only 
20.7 per cent of respondents agreeing with both propositions. 


* Significantly, senior authors and researchers found (reader) downloads credible 
in terms of determining the usefulness of research. 


* Chasing up references in papers still remained the most popular method for 
discovering journal articles of interest. The convenience and speed of electronic 
tools was highly appreciated but the role and importance of the physical library 
merits serious reflection — libraries were ranked in 11th position out of 12 methods. 
Clearly, libraries need to consider their position/visibility in a digital world where
their users are removed from them and not even conscious they are the ones who 
pay the access bills. We have found in our log studies that usage depends highly on 
visibility and libraries are facing real problems maintaining their visibility in an 
increasingly digital information world (Nicholas et al., 2005a). 

* In regard to open access two significant shifts appear to have occurred since the last
survey. Firstly, the research community is now much more aware of the open access
issue. There has been a large rise in authors knowing quite a lot about open access (up
10 percentage points from the 2004 figure) and a big fall in authors knowing nothing
at all about open access (down 25 points). Secondly, the proportion of authors
publishing in an open access journal has grown considerably from 11 per cent (2004)
to 29 per cent. This figure is possibly an overestimate as Elsevier researchers (Amin,
2005) have discovered that 65 per cent of authors who claimed to have published in
an open access journal had in fact published in a “traditional” journal. They
determined this by asking respondents who said they had published in an OA
journal to list the last OA journal they published in. Examining the list revealed that
in 65 per cent of cases these titles were not in fact OA journals. The explanation for
this is thought to be that most authors’ access to journals is free (or open).


* Authors strongly believed that, as a result of open access, articles would become
more accessible and, somewhat less strongly, that libraries would have more to
spend. They did not really believe, however, that quality would improve.


* There was a clear majority of authors believing that open access would prove 
undermining for scholarly publishing. Of those who expressed an opinion, half 
believed this was likely; however, a good proportion of these people thought this 
would probably be a good thing. 


* There was little enthusiasm for author- or reader-facing charges, and a feeling
that libraries should not have to make such a large contribution to journal costs
as they do at the moment. The favoured option was that a greater burden should
be borne (in this order) by research funders, commercial sponsors and central
government. There were, however, dangers here because we found that almost a 
third of authors had published most of their work without external funding 
(i.e. with 50 per cent or fewer, or none, of their papers having been funded). 


* Authors were not very knowledgeable about institutional repositories: less
than 10 per cent declared that they had “a little” or “a lot” of knowledge and there
was a significant dragging of feet, with a substantial percentage (38 per cent) of
those expressing an opinion declaring their unwillingness to deposit articles
in an institutional repository.


Note 


1. The full report, questionnaire, etc. and news of other publications emanating from the study 
can be found at www.ucl.ac.uk/ciber/ 
Abstract


Purpose — The purpose of the study was to compare an internet search engine, Google, with 
appropriate library databases and systems, in order to assess the relative value, strengths and 
weaknesses of the two sorts of system. 

Design/methodology/approach — A case study approach was used, with detailed analysis and
failure checking of results. The performance of the two systems was assessed in terms of coverage,
unique records, precision, and quality and accessibility of results. A novel form of relevance
assessment, based on the work of Saracevic and others, was devised.

Findings — Google is superior for coverage and accessibility. Library systems are superior for quality 
of results. Precision is similar for both systems. Good coverage requires use of both, as both have 
many unique items. Improving the skills of the searcher is likely to give better results from the library 
systems, but not from Google. 

Research limitations/implications — Only four case studies were included. These were limited to
the kind of queries likely to be searched by university students. Library resources were limited to those
in two UK academic libraries. Only the basic Google web search functionality was used, and only the
top ten records examined.

Practical implications — The results offer guidance for those providing support and training for 
use of these retrieval systems, and also provide evidence for debates on the “Google phenomenon”. 
Originality/value — This is one of the few studies which provide evidence on the relative 
performance of internet search engines and library databases, and the only one to conduct such 
in-depth case studies. The method for the assessment of relevance is novel. 

Keywords Academic libraries, Search engines, Information retrieval 
Introduction

Without doubt, the internet, and specifically the world wide web, has transformed the 
information environment in the past decade, providing more rapid access to a greater 
volume of material than possible at any earlier time. Searching tools, though far from 
perfect, have played a major part in this transformation. One of these tools, the Google 
search engine, has become predominant, to the extent that “to Google” had become de 
facto a verb in the English language by mid-2003, despite the objections of the 
company (Quint, 2002; BBC, 2003). 

Google is, therefore, representative of the variety of easy-to-use search engines, 
based on free-text searching of the content of public web pages. It is indeed their major 
representative, given the mission of the company “to make all the world’s information 
available” (Library Journal News, 2003). The extension of the “basic” Google search 
function into Google Scholar (providing access to non-copyright academic material)
(Tenopir, 2005), Google Print (searching the digitised full text of printed books, from
publishers, booksellers or libraries, and allowing the viewing of a small extract of
copyright material) (Fialkoff, 2005), and other ventures, suggests that this may not be a
wild ambition.

While these engines have indisputably made much information searching quicker 
and more efficient, they have also led to the belief that all information is to be found
there, and retrieved without undue effort: “library patrons expect to find it all in 
cyberspace ...for the purposes of academic research, such expectations are unrealistic 
and even dangerous” (Lawrence and Miller, 2000, p. 1). In turn, this leads to a dismissal 
of any other sources of information, specifically of libraries and the formal information 
sources which they provide: 


In less than a decade, Internet search engines have completely changed how people gather 
information. No longer must we run to a library to look up something; rather we can pull up 
documents with just a few clicks on a keyboard. Now ... “Googling” has become synonymous 
with doing research (Mostafa, 2005).


Web search engines, and Google in particular, have created a generation of searchers 
who are choosing the simplicity of search engines on the open free web over the 
perceived complexity of library services. Libraries can no longer cater for “people who 
want fast, easy access to unlimited, full-text content using interfaces that require no 
critical thought or evaluation” (Bell, 2004). Fast and Campbell (2004) found that 
students “admired the organisation of (an) OPAC, but preferred to use the web in spite 
of its disorganised state”. 

It is, of course, inevitable that convenient access to information which, while it 
may not be comprehensive or of the highest quality, is good enough will be alluring. 
This is a natural human impulse, codified by Zipf into his principle of least effort 
(Sole and Cancho, 2003) and by Simon in his concept of “satisficing” (Tennant, 2001; 
Agusto, 2002), not to mention the complaint of some in the library/information area 
that we live in a society “fuelled by a culture of instant gratification” (Stoffle et al.,
1996, p. 219).

According to Bell (2004), “Google has become the symbol of competition to the 
academic library”. He uses the term “infobesity” to compare the way students now 
search for information with the modern consumption of fast food. Originally coined 
by James Morris, the Dean of the School of Computer Science at Carnegie Mellon 
University, “infobesity” refers to the belief that searching Google for information 
provides a junk information diet. Bell believes that students often want to find 
something quickly and that they are generally not concerned about the quality. It is 
clear from a review of the numerous articles published on this subject, that there is 
a general belief in the library community that more “nutritious” information can be
retrieved by using the specialised databases available in an academic library. There
are debates as to the amount of information available through systems such as
Google compared to the “hidden web” of library databases (Tenopir, 2004; Herring,
2001; Devine and Egger-Sider, 2004), as well as concerns about the quality of
material retrieved (perhaps uncritically) from a search engine (Herring, 2001;
Tennant, 2001).

The study reported here aimed to investigate, by detailed analysis of a small 
number of case studies, to what extent it is true that Google can supplant library 
databases, for typical queries of the sort likely to be researched by students. The study 
was carried out as a Masters Degree dissertation, and full details may be found in the
dissertation (Brophy, 2004). 


Purpose and scope of the study 
The purpose of this study was to compare the performance of Google to that of library 
database services in answering queries of the sort likely to be asked by students. In 
answering this question, it was hoped to gain an understanding of the optimal 
performance of both types of service, and of their relative advantages and 
disadvantages. By “library database services” we mean online catalogues and 
bibliographic databases, both general and specialised. 

Although there have been numerous comparative evaluations of databases, library
systems, and web search engines (for reviews of these studies see Brophy, 2004; Xie,
2004), relatively few have attempted this sort of direct comparison of two kinds of
service.

One example is that of Xie (2004), who compared online database systems
(Dialog and Factiva) with three different types of web search tool (search engine,
directory and a meta-search engine). Students were asked to search the same two
topics on each system and then required to give relevance scores, with precision
of each system calculated on the total retrieved relevant documents (to a
maximum of 20). Another is that of Fast and Campbell (2004), who examined
the perceptions of students searching Google and a university library OPAC,
using interviews, verbal reports and observations. A third is that of Griffiths
and Brophy (2005), who report two studies of the use of Google and of various
academic information resources, finding a predominant use of internet search
engines. These examples illustrate the various tools which may be used to
investigate a complex situation.

The scope of this project is restricted on the one hand to the basic Google web 
search function, and on the other to services provided by “typical” academic libraries in 
the UK. No attempt is made to deal with other internet search tools, or other Google 
services, nor the services provided by other library sectors. 

The study involves four test queries, which are analysed in detail by a single expert
searcher. While this number of queries is not enough to allow a claim that the results
are applicable to all situations, the depth of qualitative analysis should allow a good
understanding of the issues.


Methodology 
Only the main points of the study methodology are described here: full details may be 
found in Brophy (2004). 

A case study approach was adopted, with both quantitative and qualitative aspects. 
Quantitative results allowed an assessment of recall, precision, overlap and similar 
factors, while qualitative results allowed inclusion of ideas of quality and value of 
information (Bawden, 1990). 

Using a small number of cases, each with a test query searched by the investigator, 
allowed a detailed study of the documents retrieved, and of the reasons for their not 
being retrieved from each service. This led to a detailed understanding of the reasons 
for the relative performance of the services. 
The queries used were open-ended “research based” questions (Bilal, 2001), rather
than closed reference style queries, since these allowed the performance of the services 
to be tested to best effect. 

To avoid possible investigator bias in the choice of topics, four general subject areas 
were assigned by a librarian, unassociated with the study, as typifying appropriate 
domains. These were: environmental science, music, education, and law. Specific 
queries suitable for these subject domains were then chosen from a list of queries used 
in an online searching assignment. These were intended to mimic typical “real” student 
queries. 

The test queries are shown in detail in Table I. It may be noted that the topics are as 
likely to be covered in web documents as in library database records, and that 
specialised vocabularies, while they may be available, are not required. In this sense 
they are a “fair test”, offering scope for both services to perform well. 

The test queries were then searched on Google, and on databases and resources 
deemed suitable in an appropriate academic library. Different libraries were used 
because the library at City University London, at which the study was carried out, has 
little material on environmental sciences because of the subject profile of the 
university. The resources of Kingston University library were used for this query. 

Each of the library search sessions involved using an average of 14 queries to
interrogate a variety of IR systems. Whilst some specialist subject databases such as
RILM Abstracts of Music Literature and Westlaw were used in only one task, a
selection of more general sources (such as Factiva, Web of Knowledge, and Ingenta 
Journals) were used in all four tasks. 

To create a more “natural” searching environment the researcher undertook all 
query formulation via direct interaction with each system. This allowed for goals 
and search strategies to change with every stage of the search process. As a 
result, no restriction was placed on the number of searches in each session. For
practical reasons, however, the total session length for each task was restricted. 
All systems for a given query were tested in the same day, with each session 
lasting no longer than two hours. After this time period, the results of each search 
were saved and stored in order to allow evaluation to take place at a later stage. 
This was intended to avoid the possibility of pages being unavailable at a later 
stage of the evaluation, resulting in them being judged unfairly as inactive 
documents. 

It was necessary to address four issues in analysing the results: the quality, 
relevance and accessibility of the documents retrieved, and the coverage of these 
documents in the systems tested. 

Quality was assessed using Robinson’s (2000) framework, which considers:

Context:

* Relevance. Is the resource suitable for its intended use/users?

* Authority. Who is responsible for the resource? Are they qualified?

* Provenance. How stable is the resource? What is its lifespan?

* Objectivity. Does the resource provide a balanced, evaluated viewpoint?


Content:

* Currency. Is the information up-to-date? Will it be updated regularly?

* Accuracy. How accurate is the resource?

* Coverage. Is the resource comprehensive? What is the subject coverage?

Table I. Test queries

Test query 1
Domain: Law
Topic: Internet piracy
Task: The internet has changed the way people obtain music. Does file
sharing hurt the music industry, and can record companies take legal
action against illegal down-loaders? Can any additional measures be
taken to deter potential “pirates”?
Description: Documents should identify information regarding the current trend for
downloading music from the internet. Documents should resolve any
number of the following questions:
How has the internet changed the way people obtain music?
Is the music industry losing revenue?
Is downloading music illegal?
If so, what legislation is in place?
Does the current legal protection need to be changed?
What measures can be taken immediately to deter internet “pirates”?
Information is required which covers the legal aspects of the situation
and possible changes which could make file-sharing an acceptable
practice to all involved
Narrative: Relevant documents will be suitable for an information science student
studying a “Law Module”. They should provide
information/discussion of the primary concepts detailed below.
Documents should be in English and be suitable for a student whose
main subject is not law. Documents aimed at law professionals will not
be suitable
Concepts:
Intellectual property
Piracy or pirate
Illegal downloading
File sharing
Legislation
Legal implications
Napster
Music industry or music recording industry
Record companies
Digital copyright
Peer-to-peer file sharing or P2P file sharing
Copyright infringement
Intellectual property
Online music

Test query 2
Domain: Environmental Science
Topic: Endangered Animals
Task: How many species of animals are in danger of extinction? What
could/should be done? What are the scientific/ethical issues?
Description: Documents should resolve any number of the following questions:
How do you define an endangered animal?
How many are endangered?
Why are they threatened?
What conservation steps are being taken and what more can be done?
What legislation exists?
What are the scientific and ethical issues?
Narrative: Relevant documents will be suitable for an undergraduate
Environmental Science student. They should provide information
regarding the concepts listed below. The student has a general
understanding of environmental and biological terms but is by no means
a specialist in this area. As such a variety of documents covering general
background issues through to more focused analysis (which is not too
technically demanding) will be required. In addition this student has not
previously studied any law so information regarding legislation should
be of a suitable level. Documents regarding endangered plants, insects,
crustaceans and fish are not considered relevant. Book reviews are not
considered relevant unless they contain a significant amount of data
Concepts:
Endangered species or threatened species
Animals or mammals or wildlife Population
Conservation or environmental management
Environmental legislation
Human impact or human activities effects or Environmental impact
Environmental monitoring
Biological diversity or biodiversity or ecological balance
Habitat protection or loss of key habitat
Ethics or moral concepts or moral values
Scientific information or environmental information

Test query 3
Domain: Education
Topic: Children’s reading habits
Task: Has the success of Harry Potter encouraged children’s reading in
general? If so, is this likely to continue? If not, why not?
Description: Documents should identify information regarding the impact of J.K.
Rowling’s books on children’s reading. Information is required which
covers the popularity of the Harry Potter books with children, changes
in the reading habits of children, the causes of any such changes and
any action being taken towards improving the situation
Narrative: Relevant documents will be suitable for a student studying Educational
Theory. They should provide information/discussion of the primary
concepts detailed below. Documents about teaching “methods”
(teacher’s notes, etc.), book/film reviews are not relevant. Documents
should be in English and be aimed at undergraduate level studies
Concepts:
Harry Potter
Children’s reading habits
Rowling, Joanne K.
Children’s books
Harry Potter books
Children’s literature
Media habits of children
Good reading habits
Reading fundamental
Effect of internet/computer games/television on children’s reading
English literature twentieth century
English literature twenty-first century

Test query 4
Domain: Music
Topic: Opera
Task: Describe the literary, musical and historical background to Gounod’s
opera, “Faust”. How does this particular work fit into its historical
context? What recordings of the work are available?
Description: Documents should identify documents which provide some answer to
any of the following questions:
What is the cultural, social and historical background to the opera?
What literature is the opera based on?
What period of history was Gounod writing in?
How does this work fit into its historical context?
What recordings of this opera are available?
Which is considered to be the best?
Narrative: Relevant documents will be suitable for a first year Music student.
General musical terminology is understood but the student has little
knowledge of opera and its history. Details of individual productions
are not relevant unless historically important or relate to specific
recordings. Similarly singer biographies and university syllabuses are
not considered relevant
Concepts:
Faust or Faustus
Gounod
Opera
Literature
Goethe
Marlow or Marlowe
Recordings
Historical context
Composer
Musical period
Musical style
Opera (nineteenth century)
French lyric opera
Structure and style


Relevance was assessed by the investigator. For each query within the four search 
sessions using Google, only the top ten hits were judged for relevance, in keeping with 
the level of evaluation in previous studies of this type (Schwartz, 1998). 

Each retrieved document was judged for relevance using the three levels used by
Chu and Rosenthal (1996): relevant, partially relevant and not relevant (Table II).
The following novel relevance framework — based upon the work of Saracevic (1996), 
Barry and Schamber (1998) and Greisdorf (2003) — was adopted as the method for 
judging the degree to which each document was considered relevant: 


* A document is considered topical if it matches the subject matter of any aspect of 
the query. 

* A document is considered pertinent if it can be considered to be informative, that 
is to say if it contains substantive information, rather than just, for example, a 
list of resources. 

* A positive mark for utility means that the document is considered useful for
satisfying the information need.


It is possible that a document could be rated “Y” for utility, even though it is rated “N”
for topicality. This would be an example of serendipity: something not matching the
question posed, but useful, for some unanticipated reason, in the supposed user
context. (This did not occur in this study.)
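
As a minimal sketch of how three Y/N marks might be recorded and rolled up into the three relevance levels, the following Python fragment may help. The combination rule in `relevance_level` (all three marks positive gives "relevant", none gives "not relevant", anything else gives "partially relevant") is purely an illustrative assumption and is not taken from the study's own decision table; the names `Judgement`, `relevance_level` and `category_percentages` are likewise hypothetical.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Judgement(NamedTuple):
    """Y/N marks given by the assessor to one retrieved document."""
    topical: bool    # matches the subject matter of any aspect of the query
    pertinent: bool  # contains substantive information
    useful: bool     # helps to satisfy the information need


LEVELS = ("relevant", "partially relevant", "not relevant")


def relevance_level(j: Judgement) -> str:
    # ASSUMPTION: illustrative rule only, not the study's published table.
    positives = sum(j)  # True counts as 1, False as 0
    if positives == 3:
        return "relevant"
    if positives == 0:
        return "not relevant"
    return "partially relevant"


def category_percentages(judgements: Iterable[Judgement]) -> dict[str, float]:
    """Percentage of documents falling into each relevance level."""
    counts = Counter(relevance_level(j) for j in judgements)
    total = sum(counts.values())
    if total == 0:
        return {level: 0.0 for level in LEVELS}
    return {level: 100.0 * counts[level] / total for level in LEVELS}
```

A serendipitous document, in the sense described above, would be recorded as `Judgement(topical=False, pertinent=..., useful=True)`; under the assumed rule it would count as partially relevant.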

Accessibility. Documents from Google were considered accessible if the link to them 
was active. Documents from other services were considered accessible if full-text
access was available either instantly, on demand through a service such as Athens, or 
if the resource was easily accessible in hard-copy format. The latter was taken to mean 
that it was available in a library available to the researcher, and within reasonable 
travelling distance. 

Coverage was assessed by carrying out a failure analysis, as a way of identifying 
relevant documents which are present in a system but have not (for whatever reason) 
been retrieved, augmenting the list of retrieved items, following the method of 
Robinson et al. (2000).

“Recall” was not assessed, due to the large and unstructured nature of the search
space, which makes traditional recall measures infeasible. Recall measurements were 
substituted by measures of the coverage of useful items in each system. 


Results 
A summary of results only is given here. A fuller treatment is given in Brophy (2004). 

A total of 723 documents were retrieved over all the searches for the four test 
queries. Of these, 237 from Google, and 163 from library systems, were judged relevant 
according to the framework above. 

The results are summarised below under the criteria: retrieval; precision; quality; 
overall quality; accessibility; coverage; failure analysis; and uniqueness. 

Figure 1 shows the percentage of retrieved documents within each of three 
relevance categories. 

Figure 2 shows the precision scores (calculated in the usual way, as number of 
relevant retrieved items in ratio to total retrieved items) for the two different systems in 
the four respective subject domains. The results indicate that neither system 
consistently provides a more precise set of results. In three out of four cases the results 
were similar, suggesting that the top ranked results from a Google search have similar 
relevance to results obtained using Library systems. 
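
The precision calculation described above can be sketched in a few lines of Python. The counts used in the example are hypothetical placeholders chosen only to show the arithmetic; they are not figures from the study, and the function name `precision` is an assumption of this sketch.

```python
def precision(relevant_retrieved: int, total_retrieved: int) -> float:
    """Precision = relevant retrieved items / total retrieved items."""
    if total_retrieved <= 0:
        raise ValueError("total_retrieved must be positive")
    if not 0 <= relevant_retrieved <= total_retrieved:
        raise ValueError("relevant_retrieved must lie between 0 and total_retrieved")
    return relevant_retrieved / total_retrieved


# Hypothetical counts for one subject domain (NOT data from the study):
# (relevant retrieved, total retrieved) for each system.
example_counts = {
    "Google": (6, 10),
    "Library systems": (12, 20),
}

for system, (relevant, total) in example_counts.items():
    print(f"{system}: precision = {precision(relevant, total):.2f}")
```

Computing the ratio separately for each system and each subject domain gives the kind of side-by-side comparison that Figure 2 presents.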

Figure 3 shows a summary of the results of the Quality assessments conducted on 
the results of all four tasks. Overall, 52 per cent of Google results were found to be good 
quality whilst library systems recorded a figure of 84 per cent (Figure 4). This seems to 
indicate that if good quality results are required, it is better to use library systems. 
However, only a very small percentage of top-ranked Google results (4 per cent) were
poor quality. Only two broken links were found when evaluating the four Google
document result sets, indicating the system is updated frequently in order to remove
any bad links. Google should not, therefore, be dismissed as providing consistently
poor quality results.

Table II. Framework for judging relevance

Document number Topicality Pertinence Utility Relevant Partially relevant Not relevant
1 Y Y Y X
2 Y Y N X
3 Y N N X
4 Y N Y X
5 N N N X
6 N N Y X
7 N Y Y X
8 N Y N X


The presumed ability of search engines like Google to provide immediate access to 
full-text material is confirmed in the results for accessibility, which consistently show 
Google providing immediate access to over 90 per cent (over 97 per cent in three out of 
four tasks) of its relevant results (Figure 5). 

The results for the library systems were more varied than for Google, with 
65 per cent of the total results being found to be immediately accessible. In search 
sessions 2-4, a significant number of documents had to be sourced via the British 
Library or via an Inter-Library Loan Service (ILL). This would undoubtedly be 
off-putting to many students due to the time and costs involved in accessing the 
information. Whilst the numbers are low (less than 10 per cent in three out of four 
cases), relevant good quality documents were ultimately found which were unavailable 
even in the British Library collections. Overall, almost 35 per cent of the relevant 
library documents were found either to require an ILL or to be inaccessible altogether. 

Virtually all Google documents were immediately accessible or not accessible at all, 
due to broken links: a small number were accessible with difficulty, in that the link 
address had changed, and had to be identified by searching. 

Figure 4. Overall quality of relevant documents 

Figure 5. Relevant documents by accessibility 

Figure 6. Coverage 

Figure 7. Failure to retrieve relevant documents 

The coverage of the systems over the four tasks is shown in Figure 6, which displays 
the percentage of useful items which are indexed by each system for each of the four 
tasks. Google’s coverage is better in each case. It is clear from these results that there is a 
degree of overlap in the documents obtained from the two systems. In three out of four 
cases the total useful coverage can be seen to be greater than 100 per cent, demonstrating 
that a proportion of documents in each case are indexed by both systems. 
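
The overlap argument can be illustrated with simple set arithmetic. The sketch below is an assumption-laden illustration, not the study's procedure: the document identifiers and index contents are hypothetical, and every useful item is assumed to be indexed by at least one of the two systems. Under that assumption, the amount by which the two coverage figures together exceed 100 per cent equals the share of useful items indexed by both.

```python
def coverage_report(useful, google_index, library_index):
    """Percentage of useful items indexed by each system, and by both."""
    useful = set(useful)
    in_google = useful & set(google_index)
    in_library = useful & set(library_index)
    both = in_google & in_library
    n = len(useful)
    return {
        "google_coverage": 100 * len(in_google) / n,
        "library_coverage": 100 * len(in_library) / n,
        "overlap": 100 * len(both) / n,
    }


# Hypothetical pool of ten useful documents (illustration only).
useful_items = [f"d{i}" for i in range(1, 11)]
google_items = [f"d{i}" for i in range(1, 8)]    # d1..d7
library_items = [f"d{i}" for i in range(5, 11)]  # d5..d10

report = coverage_report(useful_items, google_items, library_items)
excess = report["google_coverage"] + report["library_coverage"] - 100
print(report, "excess over 100 per cent:", excess)
```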

In order to understand the reason certain documents were not retrieved, though 
present in the system, the search queries used in each session were assessed (Figure 7). 
For Google, 31 per cent of the additional documents could have been found by 
improving the terms used when constructing the search queries. However, for almost 
70 per cent of documents, Google required the specific title of the document to be 
queried. For the library, a more balanced result was apparent with 50 per cent of 
documents being sourced on improving the search query and 50 per cent requiring 
very specific query terms. This result shows a large proportion of results which would 
have been extremely difficult to retrieve, irrespective of the skills of the searcher. It also 
suggests that the improvement of searching skills, by training programmes, for 
example, is more appropriate to library systems rather than Google. 

Figure 8 shows the proportion of relevant documents which are only indexed by one 
system: “uniques”. 
Shown as a percentage, the results are fairly high across all eight search sessions, with 
Task 1 (library) the only session to record less than 60 per cent of its results as unique. 
Overall, Google provided a higher proportion of unique results than library systems, 
indicating that more library material is indexed by Google than vice versa. 

The main data from these results are summarised in tabular form in Tables II-V. 
These show figures for quality, accessibility and retrieval performance (precision, 
coverage, and uniques), respectively. 





Conclusions 

The results of this study may be summarised by saying that the two kinds of resource 
— search engines and library databases — seem to be complementary, as assessed by 
their performance over this small set of test queries. 

Table III. Overall quality of relevant material (mean) 

Quality level                 Google (mean) (per cent)    Library (mean) (per cent) 
Good quality                  52                          84 
Adequate quality              44                          16 
Poor quality                  4                           0 

Table IV. Overall accessibility of relevant material (mean) 

Accessibility level           Google (mean) (per cent)    Library (mean) (per cent) 
Immediately accessible        96                          65 
Accessible with difficulty    2                           28 
Not accessible                2                           7 

Table V. Overall retrieval performance (mean) 

Measure                       Google (mean) (per cent)    Library (mean) (per cent) 
Precision                     56                          54 
Coverage                      69                          44 
Unique                        83                          73 


Google appears clearly superior in both coverage and accessibility. Google also gives 
more unique items, though the difference is small, and both types of resource are 
needed if good coverage is required. 

The library databases are superior for quality of results, though Google’s 
performance here is not unacceptable, when only the top ten results are considered. 
Precision is similar for both systems, again with the caveat that only the top ten Google 
results are considered. The precision is not particularly impressive in either case, being 
less than 60 per cent. 

Intriguingly, improving the skills of the searcher is likely to give better results from 
the library systems, but not from Google. This has implications for user awareness and 
training programmes. It may be seen as a worrying factor, given the tendency 
identified above, and emphasised in the findings of Griffiths and Brophy (2005) for ease 
of use, and by implication lack of need for training, to be the major factor in choice of 
source. If systems like Google are indeed usable to best effect without training to 
improve competence, then it is highly likely that they will be preferred. 

In comparing the systems, in terms of advantages and disadvantages, the 
conclusions can be summarised as follows: 

Google 
* a high proportion of relevant documents retrieved; 
* an ability to retrieve a fairly precise set of documents; 
* a high proportion of adequate or good quality results; 
* a high proportion of unique documents; and 
* no problems with accessibility. 


Library databases 
* a moderate proportion of relevant documents retrieved; 
* an ability to retrieve a fairly precise set of documents; 
* a high proportion of good quality results; 
* a high proportion of unique documents; and 
* some problems with accessibility. 


The main discriminating factors seem to be quality (favouring library systems) and 
accessibility (favouring Google). Coverage also favours Google, although both systems 
are needed to achieve anything approaching comprehensive recall. 

These conclusions seem to reflect the issues raised at the start, and reflect those 
identified by Griffiths and Brophy (2005). Accessibility is likely (rightly or wrongly) to 
be favoured over quality as a determinant of choice by the student users considered 
here. Lack of comprehensiveness in retrieval is unlikely to be a strong motivator for 
these users to use any retrieval systems in addition to an internet search engine. Nor is 
the prospect of undertaking extra training to make better use of library databases 
likely to be attractive, when this is not useful for Google. 

It may be that, in the medium term, such issues will disappear, as search engines 
like Google — and the more advanced ones which will succeed it (Mostafa, 2005) — 
acquire more “academic” content, and as library system interfaces take on the 
look-and-feel of search engines. In effect, the systems will merge, hopefully 
encapsulating the best features of both. The difficulties of adapting library systems in 
this way should not be underestimated (Fast and Campbell, 2004): they have, after all, 
been designed with the basic assumption of users having what Borgman (1996) terms 
“a rich conceptual framework for information retrieval”. 

In the shorter term, there may be a role for information specialists, in their capacity 
as facilitators of information literacy, in helping their users to appreciate the 
limitations of all available systems, and suggest strategies to overcome them. This 
could also include some understanding of the nature of academic information and its 
structuring. While it is perhaps unreasonable to expect a typical student to gain 
Borgman’s conceptual framework in its entirety, some realistic understanding of the 
world of information may be a realistic aim. 

Purpose — This study provides insights into the early market for e-books in the UK through survey 
research with members of a large online panel. 

Design/methodology/approach — Data were collected from an online panel established by a 
leading commercial internet research company. Members of the panel are signalled each week to take 
part in web surveys. Respondents completed an online questionnaire posted on the company’s web 
site. Questions explored awareness, trialling, purchase and borrowing of e-books, examining the 
frequency of such behaviour and types of publications accessed and/or obtained. 

Findings — A significant proportion of respondents (85 per cent) were aware of e-books. Among these 
respondents, around half (49 per cent) had made trial use of them, nearly four in ten (38 per cent) had 
bought at least one e-book, and one in seven (13 per cent) had borrowed an e-book from a library. 
Technical books and non-fiction publications related to hobbies and interests were among those most 
popularly used and bought. The main perceived advantages of e-books are that they can be obtained 
more conveniently than going via a bookstore and they are often cheaper than hard copy versions. 
Research limitations/implications — This online survey was dependent on respondent 
self-selection. This meant that there was no central control over the return sample profile. 
Originality/value — This survey provided an early look at the e-book market in the UK. Findings 
indicated the market potential of e-books given that the equipment needed to read them is regarded 
neither as too expensive nor too difficult to use. It is clear, however, that early e-book users regard 
electronic reading as something to use primarily for reference work rather than for more extended reading for 
leisure and entertainment. Most e-book users (56 per cent) still preferred not to read extended passages 
of text from a screen. Nonetheless, for dipping in and out of reference works e-books have the 
advantage of being easier to search and easier to annotate. 


Keywords Internet, Electronic publishing, United Kingdom, Electronic books, User studies 
Paper type Research paper 


Introduction 
The growth of information and communications technologies has heralded dramatic 
developments in the publishing world. Electronic publishing has emerged as a major 
growth phenomenon in a rapidly evolving media landscape. The emergence in 
particular of the internet has opened up many fresh opportunities for dissemination of 
content to large and small consumer markets. Among these is the use of new 
technology to deliver books to readers. E-publishing can move much more rapidly than 
traditional publishing and as the technology for reception penetrates the consumer 
marketplace, it is likely to become the major force in publishing in the future. 
When Stephen King published his novella titled Riding the Bullet in electronic 
format, and made it available only via the internet, more than 500,000 people tried to 
download it within one week of its publication. At a cost of $2, this electronic novel [...] 
for a traditional paper novel. Cost factors and convenience of access and storage are 
likely to surface as key drivers of the e-publishing market. This applies to readers and 
publishers. Getting the right business model through which to sell e-books is as 
important as the actual price for consumers. When King subsequently released The 
Plant online in June 2000, readers were invited to buy it in instalments. Within a few 
months, however, King stopped posting new chapters because less than half its readers 
bothered to pay for it (Hartman, 2002). 


What is an e-book? 

An e-book, in the broadest sense, comprises any book or monograph-length sequence 
of text made available in electronic form. According to one writer, in a strictly technical 
sense, “an electronic book, or e-book, is the presentation of electronic files in digital 
displays” (Romano, 2001, p. 3). As any internet user knows, the world wide web 
contains a huge quantity and variety of electronic text forms. Increasingly, though, an 
e-book is becoming more precisely defined in terms of text that can be read via the use 
of e-book software and hardware. Indeed, the term e-book is sometimes used to refer 
specifically to such reading equipment (Ormes, 2001). 

As with all new communications technology developments, various tools have been 
developed to enable e-book reading. Although there have been early attempts to 
establish a standard format for e-book reading, such as the Open eBook (OEB) 
standard, Adobe’s PDF is a competing format. In this respect, the early e-book market 
resembles the early video-recorder market with competing VHS and Betamax formats 
and the early satellite television market with the competing Sky Television and British 
Satellite Broadcasting dish technologies. Eventually, one standard format emerged as 
dominant and the same may be true for e-publishing. 


At present, the e-book market is in its infancy. Early adopters have begun to utilise 
this format, but many readers are still reluctant to abandon paper books for books held 
on an electronic reader. Nonetheless, as new mobile communications technologies 
evolve apace and become fashion accessories as much as functional devices, so 
too are e-book readers starting to establish a trendy reputation driven not only by the 
efforts of popular authors such as Stephen King, but also by their recognition in the 
popular media as must-have fashion items (Writers Write, 2005). 

E-books are now published by a large number of organisations. By 2001, Romano 
claimed that more than 100 e-publications providing e-books for on-screen reading 
could already be found upon searching the internet (Romano, 2001). That number has 
grown many times over since then. The author accessed just three directory sites on 
6 May 2005, and found over 150 different e-book sites listed (www.computercrowsnest.com, 
www.business.com, and www.see-search.com). There were many more directory 
sites located that provided still further such links. 

Many new publishing companies have sprung up specifically to produce e-books 
and to distribute them over the internet. Mainstream publishing houses have also 
recognised the importance of having a presence in this new market. While online 
publishing ventures such as Stephen King’s have served to draw wider attention to the 
phenomenon, there are many lesser known publishing houses that now operate 
exclusively on the web. E-books can be bought through the major online booksellers 
such as Amazon and Barnes and Noble, and also through specialised e-book sellers. 

The consumption of e-books can only be achieved with the proper equipment. There 
are a number of electronic devices on the market that can be used to read e-books. 
Handheld devices such as some portable computers, personal digital assistants (PDAs) 
and palmtops can be adapted to read e-books, by having the appropriate software 
installed. There are dedicated e-book readers that can also be carried around. They do 
not have the range of functionality found in PDAs or palmtops, but they often offer 
larger screens and more memory, and can be easy to use. Finally, desktop personal 
computers can have e-book reading software installed, which then effectively converts 
them into e-book readers. With the right software e-books can be downloaded from the 
web. 


Who buys e-books? 

The growth of e-book publishers and the migration of mainstream publishers into 
online publishing is testament to the burgeoning consumer market. The e-book 
customer base can be defined in terms of types of content they consume and the 
reading devices they use (Spiselman, 2001). The most popular genres have been 
identified as non-reference textbooks, science fiction and romance novels. The 
principal reading devices of choice are the PC notebook and the Pocket PC (Spiselman, 
2001). 

One advantage of e-books is that they are easier to carry around than ordinary 
paper books (or p-books). An e-book reading device may be able to store dozens or even 
hundreds of e-books, and with each new e-book that is loaded, the device itself does not 
become any heavier. 

Publishers of academic texts have been swift to recognise the potential of e-books. 
Most mainstream academic publishers produce electronic publications. These 
publications include journals as well as textbooks and specialised monographs. The 
Joint Information Systems Committee (JISC) in the UK maintains a collections portfolio 
that constitutes a major electronic resource for the higher education and further 
education sectors. The academic library at JISC, for example, offers collections from 
Pluto Press and The Electric Book Company with other publishers due to join. This has 
nearly 300 academic book titles covering subjects such as anthropology, cultural and 
media studies and politics. The JISC web site provides links to many hundreds of other 
online publishers and publications (www.jisc.ac.uk). 


Is there a big market for e-books? 

Since the turn of the millennium, there have been a number of forecasts regarding the 
e-book market, some of which have been highly optimistic, while others have 
expressed more cautious views. Romano (2001, p. 10) noted that, “Andersen projected 
the e-book market at $2.3 billion by 2005 — 10 per cent of the estimated $21.9 billion 
consumer book market in 2005”. Spiselman (2001), for example, forecast that the 
American e-book market would grow dramatically, being worth $3.1 billion by the end 
of 2004 and over $25 billion by 2008, having reached a value of $100 million in 2000. 
This compared with a p-book market worth $25 billion in 2000 and likely to be worth 
$50 billion by 2008. Hence, proportionately, the e-book market would grow from being 
worth 0.4 per cent of the p-book market in 2000 to being worth 50 per cent of the p-book 
market by 2008. Within ten years, the e-book and p-book markets were forecast to be 
about equal in size. 
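
The proportions quoted from Spiselman's forecast follow from simple division, and the same figures imply a very steep annual growth rate. The sketch below uses only the dollar values given above (in billions); the function names are hypothetical, and the compound annual growth rate is a derived illustration rather than a figure reported by Spiselman.

```python
def share_of_print_market(e_book_value: float, p_book_value: float) -> float:
    """E-book market value as a percentage of the p-book market value."""
    return 100 * e_book_value / p_book_value


def compound_annual_growth(start_value: float, end_value: float, years: int) -> float:
    """Constant yearly growth rate that takes start_value to end_value."""
    return (end_value / start_value) ** (1 / years) - 1


# Forecast values in billions of US dollars, as quoted in the text.
forecast = {
    2000: {"e_book": 0.1, "p_book": 25.0},
    2008: {"e_book": 25.0, "p_book": 50.0},
}

for year, values in forecast.items():
    share = share_of_print_market(values["e_book"], values["p_book"])
    print(f"{year}: e-books worth {share:.1f} per cent of the p-book market")

implied_growth = compound_annual_growth(0.1, 25.0, years=8)
print(f"Implied e-book growth, 2000 to 2008: {implied_growth:.0%} per year")
```

A forecast that requires the market to roughly double every year for eight years helps explain why other analysts took a more cautious view.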

In contrast, other forecasts indicated that the print book market was not about to be 
supplanted by e-books for some time. Mayfield (2001) reported a study by commercial 
new media research agency Forrester Research that envisaged e-books as forming a 
small part of the total publishing market for some years to come. In the United States, 
e-book sales were forecast to achieve around $251 million in 2005, which contrasts 
significantly with Spiselman’s forecast of growth to $3.1 billion by 2004. Barriers to 
growth were limited content available electronically, differing reader formats and poor 
screen resolution for reading. The same report indicated, however, that other analysts, 
such as Accenture, predicted a US market for e-book consumption that would reach 
$2.3 billion by 2005, with 28 million people likely to acquire dedicated e-book readers 
(Mayfield, 2001). 

Market forecasting can be an imprecise science. More confidence can be placed 
in actual sales figures. Such figures have indicated significant periods of growth 
for e-book sales in the US. The Association of American Publishers reported that 
the e-book segment grew from $211,000 in net sales in January 2002 to just over 
$3.3 million in January 2003. This 1,447 per cent growth was easily the biggest 
year-on-year increase of any category of publishing (Blough, 2003). 

Writenews.com (2003) reported increased e-book retail unit sales of 40 per cent and 
increased e-book titles available through retailers of 144 per cent year-on-year between 
the first half of 2003 and the same period in 2002 in the US. Revenue growth over this 
period reached 30 per cent. E-book publishers increased titles published over this 
period by 45 per cent and their revenues also increased by 29 per cent. Macworld (2004) 
reported that e-book sales represented the publishing industry’s fastest growing sector. 
Sales in the first quarter of 2004 increased 46 per cent and revenues increased 28 per 
cent year-on-year. 

Further evidence of the potential of e-publishing has derived from corporate use 
of e-books. The OECD publishes much of its work in book form and experienced a 
downturn in its printed book sales during the 1990s. It launched an online 
bookshop in 1998 initially to sell printed titles. It subsequently switched to 
publishing e-books on a pay-per-view basis. These were PDF versions of its 
printed editions made available at 80 per cent of the print price and offered free of 
charge to people purchasing the print editions. This new arrangement lifted the 
OECD’s book sales. By 2000, 20 per cent of sales from its online bookshop were 
for e-books only. To overcome access problems for some users, particularly those 
going through libraries, the OECD adopted an e-journals model whereby e-books 
were located on Ingenta’s journal platform. In this field, each book was treated as 
a journal issue with chapters identified as “articles”. By mid-2003, the OECD had 
made available about 1,500 e-books in English and a further 1,000 in French. With 
this new business model, it experienced dramatic growth in online users and 
numbers of books being downloaded between 2001 and 2003 (Green, 2004). 

In essence, the e-book industry has to some extent mirrored the fortunes of the wider 
dot-com sector. Following an initial period of expansion in the 1990s, the early boom 
was succeeded by a more conservative period in which capital investment eased off 
and e-book companies focused on reducing their overheads while waiting for the 
market to become more stable and consistently profitable (Woodward and Edwards, 
2001). As the communications technology environment has evolved, the opportunities 
for e-book distribution have grown, opening up bigger potential consumer markets. 

What is needed, however, is a greater understanding of the e-book consumer. Sales 
trends may be indicative of market potential, but without knowing what e-book 
consumers are looking for and what they want from their e-reading, publishers may 
lack vital information that can pinpoint where the initial growth areas can be found 
and how best to exploit them. The study reported in this paper does not pretend to 
have all the answers, but it does represent an attempt to seek feedback directly from 
internet users about their e-book experiences. 


Methodology 

New research is reported in this paper based on an online survey of e-book use 
and experiences. The survey was conducted with an established online panel[1]. 
The panel comprises over 20,000 registered members who are polled about various 
internet-related issues on a regular basis. The survey reported here was posted on 
23 May 2005, and the data are based on responses received up to the end of May. 
A respondent base of 3,916 was achieved. The response sample displayed a female 
bias (64 per cent females; 36 per cent males). The age distribution was: 18-24s, 
10 per cent; 25-34s, 29 per cent; 35-44s, 28 per cent; 45-54s, 21 per cent; 
55-64s, 9 per cent; and 65+, 2 per cent. Data were weighted back to known 
UK population parameters. 
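
Weighting back to population parameters can be sketched as a simple post-stratification step. The example below is an assumption, not the panel company's actual procedure: it weights on gender alone, the population shares of 51 and 49 per cent are hypothetical placeholders, and the group rates are the trial-access percentages by gender reported in the Findings. Only the sample shares of 64 and 36 per cent come from the response sample described above.

```python
def poststratification_weights(sample_share, population_share):
    """Weight for each group = population share / sample share."""
    return {group: population_share[group] / sample_share[group]
            for group in sample_share}


def weighted_rate(group_rates, sample_share, weights):
    """Overall rate after each group's responses are scaled by its weight."""
    numerator = sum(group_rates[g] * sample_share[g] * weights[g] for g in group_rates)
    denominator = sum(sample_share[g] * weights[g] for g in group_rates)
    return numerator / denominator


sample_share = {"female": 0.64, "male": 0.36}      # reported response sample
population_share = {"female": 0.51, "male": 0.49}  # hypothetical population values

weights = poststratification_weights(sample_share, population_share)

# Trial-access rates by gender, used here purely as an illustration.
trial_rates = {"female": 0.44, "male": 0.56}
unweighted = sum(trial_rates[g] * sample_share[g] for g in trial_rates)
weighted = weighted_rate(trial_rates, sample_share, weights)

print(f"weights: {weights}")
print(f"unweighted rate: {unweighted:.1%}, weighted rate: {weighted:.1%}")
```

Over-represented groups receive weights below one and under-represented groups weights above one, so a female-biased sample is pulled back towards the assumed population balance.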

An initial filter question asked respondents whether they had ever heard of an 
e-book. Those who answered “yes” were routed through the remainder of the 
questionnaire and those who answered “no” completed the personal details questions 
at the end before exiting[2]. 

Among respondents who had heard of an e-book, further questions asked whether 
they had ever accessed an e-book on a trial basis on a web site. Those respondents who 
had done this were then asked what type of publication they had accessed from the 
following list: dictionary, encyclopaedia, short fictional story, full-length novel, 
technical manual, specialised research monograph, academic textbook, popular 
non-fiction — autobiography/biography, popular non-fiction — related to hobby or 
interest, and other. Respondents were then asked if they had ever purchased an e-book 
and if so how many times. They were also questioned about the type of e-book they 
had purchased using the earlier list. Further questions asked of those who had ever 
purchased an e-book included the price they had paid for it, whether the book was also 
available in hard copy form and their main reason for making the purchase. 
Respondents were provided with the following list of possible reasons for purchasing 
an e-book: “it was cheaper than the paper version of the book”; “it is easier to search the 
content”; “you can electronically annotate it”; “it offers a multi-media format”; “it was 
more convenient than going to a bookstore on foot to buy it”; “it is easier to carry 
around”; “it is easier to store”; “I just like using the latest gadgets and technologies for 
everything I do”; “I thought it might be easier to make copies for friends”; and “I also 
own the hard copy but wanted to see what an e-version would look like”. 

The questioning then turned to borrowing of e-books. Respondents were initially 
asked whether they had ever borrowed an e-book from a library using an online link. 
Those who said they had done this were then routed through further questions on this 
topic. They were asked how often they had borrowed e-books and whether they 
preferred borrowing e-books or hard copy books, or both the same. 

All respondents who had made any use of e-books were asked about the type of 
equipment they used for storing and reading e-books. The choices included: desktop 
reader installed in a standard PC; PDA; palmtop; or dedicated e-book reader. 

E-book users were also asked to indicate any problems they had experienced with 
e-books. Again a list of options was provided and respondents could endorse any that 
applied in their case. These options were: “the technology you need for e-books is 
expensive”; “the technology you need for e-books is difficult to use”; “I don’t like 
reading long text extracts on a screen”; “screen resolution can make reading difficult”; 
and “there are limited titles available in electronic form”. 


Findings 

The online survey attracted 3,916 responses in total. In response to the initial filter 
question of whether respondents had ever heard of an e-book, 85 per cent said that they 
had. These amounted to 3,322 respondents who formed the base for all subsequent 
analyses reported in this paper. The data are reported as general frequencies for the 
whole sample and independently for gender and age groups where statistically 
significant variations in responding between these groups occurred. 


Ever accessed an e-book on trial basis 

Questions about use of e-books began by asking respondents whether they had ever 
accessed an e-book on a trial basis. There was an almost even split between 
respondents who answered yes (49 per cent; n = 1,610) and no (51 per cent; n = 1,709) 
to this question. There was, however, a significant difference between males (56 per 
cent; n = 668) and females (44 per cent; n = 942) in the proportion of 
respondents who reported trial access (χ² = 43.8, df = 1, p < 0.001). 
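
The gender comparison is a chi-square test on a two-by-two table (gender by trial access). The sketch below shows the calculation under stated assumptions: the "yes" counts are those reported above, but the "no" counts of 525 males and 1,199 females are hypothetical values back-calculated from the reported percentages, and the survey's weighting is ignored. The result should therefore be read as an approximation to the published statistic, not a reproduction of it.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Rows: male, female. Columns: trial access yes, trial access no.
male_yes, female_yes = 668, 942   # reported counts
male_no, female_no = 525, 1199    # hypothetical, inferred from 56 and 44 per cent

chi2 = chi_square_2x2(male_yes, male_no, female_yes, female_no)

# With one degree of freedom, the critical value at p = 0.001 is about 10.83.
CRITICAL_VALUE_P001_DF1 = 10.828
print(f"chi-square = {chi2:.1f}, significant at 0.001: {chi2 > CRITICAL_VALUE_P001_DF1}")
```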

Of respondents who indicated that they had accessed an e-book on a trial basis 
(n = 1,627), a further question asked them to indicate what type of book this was on 
the last occasion that they had accessed a book online. Ten response categories were 
presented here. Overall, of the respondents who answered this question, the most 
frequently accessed e-books were technical manuals (42 per cent; n = 672), popular 
non-fiction related to a hobby or interest (32 per cent; n = 510), a short fictional story 
(25 per cent; n = 399), a dictionary (23 per cent; n = 378), an encyclopaedia (21 per 
cent; n = 343), a full-length novel (21 per cent; n = 343), or an academic textbook (20 per 
cent; n = 326). Less frequently trial accessed e-books were specialised research 
monographs (13 per cent; n = 209) and popular non-fiction (autobiography/biography) (10 
per cent; n = 156). A further 16 per cent of respondents (n = 252) endorsed the “other 
e-book” category. 
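
Since each percentage is simply a count divided by the number of trial users, the figures can be rechecked from the reported counts. The sketch below assumes a base of 1,610 (the respondents who reported trial access); the text also gives n = 1,627 for this group, so the choice of base is an assumption. The category labels are shortened for readability.

```python
TRIAL_BASE = 1610  # assumed base: respondents who reported trial access

trial_counts = {
    "technical manual": 672,
    "hobby or interest non-fiction": 510,
    "short fictional story": 399,
    "dictionary": 378,
    "encyclopaedia": 343,
    "full-length novel": 343,
    "academic textbook": 326,
    "research monograph": 209,
    "autobiography/biography": 156,
    "other e-book": 252,
}

# Percentage of the assumed base, rounded to the nearest whole number.
trial_percentages = {
    category: round(100 * count / TRIAL_BASE)
    for category, count in trial_counts.items()
}

for category, pct in trial_percentages.items():
    print(f"{category}: {pct} per cent (n = {trial_counts[category]})")
```

The counts sum to more than the base, which suggests that respondents could endorse more than one category.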

Male respondents were more likely than female respondents to trial access a 
technical manual (51 per cent; n = 339 vs 35 per cent; n = 330; χ² = 39.6, df = 1, 
p < 0.001); a dictionary (28 per cent; n = 185 vs 20 per cent; n = 192; χ² = 11.7, 
df = 1, p < 0.001); an encyclopaedia (27 per cent; n = 180 vs 17 per cent; n = 162; 
χ² = 22.2, df = 1, p < 0.001); and an academic textbook (23 per cent; n = 155 vs 18 
per cent; n = 171; χ² = 6.2, df = 1, p < 0.01). 

Younger respondents aged 18-24 were also significantly more likely than older age 
groups to access a dictionary (35 per cent; n = 55; χ² = 20.7, df = 5, p < 0.001) and 
an academic textbook (31 per cent; n = 48; χ² = 31.3, df = 5, p < 0.001). 

Ever purchased an e-book 

All respondents who had heard of an e-book were asked if they had ever purchased 
one. Nearly four in ten (38 per cent; n = 1,262) said that they had done so. Males (42 
per cent; n = 278) were more likely to have purchased an e-book than were females (36 
per cent; n = 337) among this sample (χ² = 5.7, df = 1, p < 0.02). Of those 
respondents who indicated that they had ever purchased an e-book, a further question 
asked them how many times they had done this. Of those who replied to this question, 
nearly one in three (30 per cent; n = 140) said this was a one-off behaviour, over four in ten (42 
per cent; n = 194) said they had done so two or three times, one in six (16 per cent; 
n = 75) had done so between four and nine times, and over one in ten (11 per cent; 
n = 52) had done this ten or more times. 

Those who said they had purchased an e-book were asked to indicate what type of 
book their last purchase had been. The same book categories were used here as for the 
trial access question. Among those who replied to this question, the most popular 
purchases were a technical manual (39 per cent; n = 238) or popular non-fiction related 
to hobby or interest (34 per cent; n = 207). Other nominated purchases were a 
full-length novel (18 per cent; n = 111), an academic textbook (15 per cent; n = 95), a 
specialised research monograph (14 per cent; n = 87), a short fictional story (13 per 
cent; n = 77), a dictionary (8 per cent; n = 49), encyclopaedia (8 per cent; n = 48), and 
an autobiography or biography (7 per cent; n = 43). One in five respondents (21 per 
cent; n = 129) endorsed the “other e-book” category. 

Males significantly outnumbered females in respect of purchases of technical 
manuals (49 per cent; n = 137 vs 30 per cent; n = 100; χ² = 24.7, df = 1, p < 0.001), 
dictionaries (12 per cent; n = 34 vs 5 per cent; n = 15; χ² = 12.6, df = 1, p < 0.001), 
and encyclopaedias (13 per cent; n = 35 vs 4 per cent; n = 13; χ² = 16.1, df = 1, 
p < 0.001). 

When asked how much they had paid for their last e-book purchase, over half the 
respondents who responded (56 per cent; n = 341) indicated it cost them less than £10. 
One in four (26 per cent; n = 158) paid between £10-20, and nearly one in five (19 per 
cent; n = 112) paid more than £20. Female respondents were significantly more likely 
than males to have paid more than £10 for an e-book (51 per cent; n = 167 vs 37 per 
cent; n = 103; χ² = 11.8, df = 1, p < 0.02). 

E-book buyers were asked whether the e-book they had last purchased was also 
available in hard copy. Of those who responded, one in three (33 per cent; n = 207) said 
that it was available also in hard copy, while nearly four in ten (38 per cent; n = 236) 
said that it was not. The remainder (28 per cent; n = 174) did not know. 

E-book buyers were then given a list of possible reasons to endorse relating to their 
e-book purchases. They were asked to indicate the main motivation or reason for 
buying an e-book. Ten different reasons were provided. Among those who answered 
this question (n = 206), the three options that received by far the greatest level of 
endorsement were that purchasing an e-book online was far more convenient than 
going on foot to a bookstore (28 per cent; n = 57), that it was cheaper than the paper 
version of the book (24 per cent; n = 50), and that it is easier to search the content of 
e-books (17 per cent; n = 34). Other reasons that were nominated by small proportions 
of respondents were that e-books are easier to store (7 per cent; n = 15), offer a 
multimedia format (5 per cent; n = 11), are easier to carry around (5 per cent; 
n = 10), fulfil a personal need to use the latest gadgets (5 per cent; n = 10), can be 
electronically annotated (4 per cent; n = 9), fulfil a need to own an electronic copy 
along with a hard copy (4 per cent; n = 8), and are easier to make copies from for 
friends (1 per cent; n = 2). 


Ever borrowed an e-book from a library 

Respondents who were aware of e-books were asked whether they had ever borrowed 
an e-book from a library using an online link. Among those who answered, around one 
in seven (13 per cent; n = 213) said that they had done this. Male respondents (17 per 
cent; n = 108) were more likely than female respondents (11 per cent; n = 105) to say 
they had borrowed an e-book (χ² = 8.7, df = 1, p < 0.003). In addition, younger 
respondents aged 18-24 (24 per cent; n = 38) were much more likely than older 
respondents to have borrowed an e-book (χ² = 38.2, df = 5, p < 0.001). 

Of those respondents who had borrowed an e-book, a further question asked them 
about the frequency of this behaviour. Among those who answered this question, 
around three in ten (29 per cent; n = 63) had done so just once, more than four in ten (44 
per cent; n = 95) had done so two or three times, one in five (19 per cent; n = 41) had 
done so between four and nine times, and a small proportion (7 per cent; n = 15) had 
borrowed ten or more times. Female respondents were more likely than male 
respondents to have borrowed e-books on just one occasion (36 per cent; n = 381 vs 24 
per cent; n = 239; χ² = 9.8, df = 4, p < 0.04). 

Among those respondents who had ever borrowed an e-book and responded to a 
further question about preferences for hard copy or e-versions, a greater proportion 
still preferred a hard copy (37 per cent; n = 79) over an electronic copy (28 per cent; 
n = 60), while more than one in three (36 per cent; n = 77) said they liked both the 
same. 


Equipment used to read an e-book 

All respondents who had e-book awareness were asked about the equipment they used 
to read e-books. Respondents could indicate any equipment they used, which meant 
that some respondents endorsed more than one piece of equipment. Of those who 
replied, the great majority (91 per cent; n = 1,467) installed a desktop reader on their 
personal computer. Fewer than one in ten used a PDA (9 per cent; n = 149), while 
smaller proportions used either a dedicated e-book reader (7 per cent; n = 105) or a 
palmtop (3 per cent; n = 44). 

Finally, all respondents with e-book awareness were probed for their attitudes about 
using e-books. The opinions examined here covered both perceived advantages and 
disadvantages of e-books. For overwhelming majorities of respondents to this 
question, e-book technology was not regarded as too difficult to use (94 per cent; 
n = 1,525) or too expensive (92 per cent; n = 1,481). However, more than one in two 
(56 per cent; n = 906) said they did not like reading long text extracts on a screen. This 
last concern was much more prevalent among female respondents (61 per cent; 
n = 576) than male respondents (49 per cent; n = 329; χ² = 22.5, df = 1, p < 0.001). 
One in three (32 per cent; n = 519) felt that there are limited titles available in electronic 
form at present. One in four (25 per cent; n = 396) also believed that screen resolution 
can make reading difficult. 










Discussion 

The research reported in this paper represents an early survey of e-book users in the UK. 
While many market projections concerning the future of the e-book sector have examined 
sales volumes and the numbers of electronically available titles being produced by 
publishers, this research has sought feedback from e-book users themselves about their 
e-book experiences. It adopts the position that a better understanding of the future 
potential of e-books can be obtained by finding out from consumers about their e-book 
behaviour, e-book preferences, and problems they may have experienced with e-books. 

Conducted among an online survey panel of internet users, this survey has indicated 
widespread awareness of e-books among the internet population. Further, it found that 
more than half of respondents here who knew about e-books had accessed an e-book 
online at least once on a trial basis. More important, perhaps, was the finding that 
nearly four in ten of respondents with e-book awareness had bought at least one e-book. 
These figures confirm earlier observations from the US that the e-book market holds 
much promise and is already becoming established (Mayfield, 2001; Spiselman, 2001). 

It is not enough, however, simply to know that the internet-savvy population is 
showing an interest in e-books. This new market needs to be understood in terms of the 
motives of its early consumers and their opinions about the use of e-books. The growing 
e-book market does not immediately spell the end for traditional hard copy books or 
traditional reading. What has become clear from this survey, both from the opinions early 
e-book users hold about reading e-books and from the evidence of the kinds of e-books that 
are most popular in the early e-book market, is that e-book use is fairly specific in type. 

E-book consumers do not invariably prefer e-books over hard copy books when it 
comes to reading. Indeed, many e-book users do not like reading extensively from a 
computer screen. They still feel more comfortable reading from the page. The most 
popular early buys are reference works — both specialist and generalist — and these 
represent the kinds of publication that readers do not customarily read from cover to 
cover. Instead, reference works are consumed in bite-size portions. For this kind of 
intellectual snacking, reading from the screen works perfectly well. In fact, e-books 
have the further advantage that they are easier to search than hard copy versions and 
can be more readily annotated. 

For novels and non-fiction works such as biographies, that readers would tend 
to read from cover to cover, the electronic reading environment works less well. 
With these publications, readers probably prefer to sit comfortably in an armchair 
with the hard copy on their laps. Sitting forward at a computer screen is a less 
relaxed and comfortable way of consuming these works. 

To sum up, e-books represent a growing market. They are often cheaper to buy than 
hard copy books and it is easier to store large quantities of text. Buying online is often 
more convenient than travelling to the nearest bookstore. At present though, the e-book 
market must look to its strengths. Consumers prefer using reference books to all others 
in an electronic environment. Given the need for publishers to bring more titles to the 
electronic marketplace, this is the sector they should perhaps focus on to begin with. 


Notes 


1. The survey used the e-global panel operated by edigitalresearch.com at 
www.edigitalresearch.com 


2. The questionnaire on e-books can be found at www.edigitalresearch.com 
Abstract 

Purpose — The first in a series on goal-based information modelling, this paper presents a literature 
review of two goal-based measurement methods. The second article in the series will build on this 
background to present an overview of some recent case-based research that shows the applicability of 
the goal-based methods for information modelling (as opposed to measurement). The third and 
concluding article in the series will present a new goal-based information model — the goal-based 
information framework (GbIF) — that is well suited to the task of documenting and evaluating 
organisational information flow. 

Design/methodology/approach — Following a literature review of the goal-question-metric (GQM) 
and goal-question-indicator-measure (GQIM) methods, the paper presents the strengths and 
weaknesses of goal-based approaches. 

Findings — The literature indicates that the goal-based methods are both rigorous and adaptable. 
With over 20 years of use, goal-based methods have achieved demonstrable and quantifiable results in 
both practitioner and academic studies. The downsides of the methods are the potential expense and 
the “expansiveness” of goal-based models. The overheads of managing the goal-based process, from 
early negotiations on objectives and goals to maintaining the model (adding new goals, questions and 
indicators), could make the method unwieldy and expensive for organisations with limited resources. 
An additional challenge identified in the literature is the narrow focus of “top-down” (i.e. goal-based) 
methods. Since the methods limit the focus to a pre-defined set of goals and questions, the opportunity 
for discovery of new information is limited. 

Research limitations/implications — Much of the previous work on goal-based methodologies has 
been confined to software measurement contexts in larger organisations with well-established 
information gathering processes. Although the next part of the series presents goal-based methods 
outside of this native context, and within low maturity organisations, further work needs to be done to 
understand the applicability of these methods in the information science discipline. 
Originality/value — This paper presents an overview of goal-based methods. The next article in the 
series will present the method outside the native context of software measurement. With the 
universality of the method established, information scientists will have a new tool to evaluate and 
document organisational information flow. 


Keywords Information, Modelling 
Paper type Research paper 


Introduction 


As the editor of a recent issue of Aslib Proceedings (Boyd, 2004a), the author suggested 
that new methods were needed to contextualise and understand the increasingly 
complex information eco-system. Over the past three years, he has presented in this 
journal derivatives of the goal-question-indicator-measure (GQIM) paradigm as one 
promising information modelling method (Boyd, 2002; 2004b; 2005). This work and 
additional case-based research have led to the development of the goal-based 
information framework (GbIF). Over a series of three articles, the author will present 
the thinking that led to the GbIF as well as an initial specification of the framework. 

This first paper presents a review of the goal-based literature, paying particular 
attention to GQIM and its predecessor goal-question-metric (GQM). The second article 
in the series will overview the recent case-based research, placing the previous work 
that appeared in this journal in context. The third and concluding article will present a 
specification of the model in the form of a user’s guide for researchers and 
practitioners. 

While an attempt has been made to keep the articles as free of jargon as possible, the 
nature of the subject does not always make that possible. As such, terms and acronyms 
are generally defined when first used. Additionally, frequently used terms and 
acronyms are outlined in Table I. 


Table I. 

Goal-question-metric (GQM): A method to collect software engineering data, whereby 
measurement goals are established, questions linked to goals are posed and metrics are 
derived to satisfy the questions 

Goal-question-indicator-measure (GQIM): The GQIM method is a way for software evaluators 
to ensure that the software measurement achieves pre-determined business objectives. An 
off-shoot of GQM, GQIM adds an “indicator” definition step. Indicators include tables, 
graphs or other graphical representations of data that link back to questions 

Goal-based information framework (GbIF): A new goal-based method, based on GQM and 
GQIM, presented in this research. GbIF takes the early GQM/GQIM research beyond its 
roots in software engineering to provide a generic evaluation and documentation method 
to understand information retrieval and exchange 

Information flow: The way information moves through a system or organisation 

Low maturity organisation (LMO): An organisation without an innate information 
processing competency 


Review of goal-question-metric (GQM) literature 

This paper overviews two of the most promising goal-based methods: Basili and 
Weiss’ GQM paradigm (often attributed in the literature to Basili and Rombach, 1988) 
and Park et al.’s (1996) GQIM. Of the two, GQIM seems to be the more complete and 
useful as the basis for a new information modelling technique. 


The GQM paradigm 

The GQM paradigm was introduced over 20 years ago as a method to collect valid 
software engineering data (Basili and Weiss, 1984). Since then, some of the world’s 
premier software development organisations, such as NASA, IBM, HP and Nokia, 
have used or experimented with GQM in several contexts. Although later adapted, the 
core fundamentals of GQM, as they were first presented by Basili and Weiss (1984, 
pp. 728-32), are as follows: 

(1) Establish goals of the data collection. Before any data are collected, goals 
for the measurement effort must be established. Goals are categorised as either 
context specific or generic: that is, goals which are of interest within a single 
project, and goals which are relevant outside a specific project context and may 
be of interest to software engineers, programmers and managers in general. 
Goals are used to ensure that the data collected are relevant to the problem area. 
Otherwise, data may be collected that are incomplete or out of context 
(“incomplete patterns or no patterns are discernable”, p. 729). 

(2) Develop a list of questions of interest. With goals of the project established, a set 
of questions that must be answered is derived. Each goal will have several 
questions associated with it. If goals address the qualitative reasoning of the 
study, questions frame the future quantitative parameters of the study. This 
process not only refines the goals, but also forces the information seeker to 
consider data collection before any resources are committed. If questions are 
unclear, do not relate back to a goal or cannot be answered, the information 
seeker can reconsider the data collection exercise. 

(3) Establish data categories. This step essentially assigns a purpose or reason for 
the data collection. Categorisation ensures that every relevant topical area 
has at least one question assigned to it, and that the questions are not all 
concerned with essentially the same measurement factor. 

(4) Design and test the data collection form. In environments where the 
information-seeking exercise is secondary to a deliverable (such as software 
development), the use of a data collection form ensures that data are collected as 
a matter of course. Without the form, old versions of documentation or 
organisational memory must be relied on. Basili and Weiss recommend a short, 
tick-box form that adheres to the following design principles (1984, p. 730): 

* fits on a single sheet of paper; 

* can be used in several (contextual) environments; and 

* permits the user some degree of flexibility. 

(5) Collect and validate data. In this step, data are collected and forms are checked 
for completeness, consistency and correctness. Interviews may be conducted in 
cases where there may be ambiguity in the data capture. Basili and Weiss (1984, 
p. 732) recommend keeping the time between data capture and validation to a 
minimum to ensure accuracy. Otherwise it may be difficult to clarify things 
weeks or months later. 

(6) Analyse data. Finally, data are analysed and mapped back to each question, 
thus deriving an answer. With the questions answered, it should be clear that 
the goals of the study have been satisfied. 
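
The chain from goals to questions to metrics described in these steps can be pictured as a small data structure. The following is a minimal sketch, not Basili and Weiss’ own notation: the class names, the helper functions and the example goal are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Question:
    text: str
    category: str                                      # data category (step 3)
    metrics: List[str] = field(default_factory=list)   # data that answer the question


@dataclass
class Goal:
    text: str
    questions: List[Question] = field(default_factory=list)


def unanswerable(goal: Goal) -> List[str]:
    """Questions with no metric attached; step 2 suggests reconsidering these."""
    return [q.text for q in goal.questions if not q.metrics]


def categories_covered(goal: Goal) -> Set[str]:
    """Data categories that have at least one question assigned (step 3)."""
    return {q.category for q in goal.questions}


# Hypothetical example goal and questions.
goal = Goal(
    "Characterise the changes made to the software",
    [
        Question("How many changes were error corrections?", "change type",
                 ["count of error-correction change forms"]),
        Question("How much effort did each change take?", "effort"),
    ],
)

print(unanswerable(goal))
print(sorted(categories_covered(goal)))
```

Keeping the metric list attached to each question makes it easy to spot, before any data are collected, which questions cannot yet be answered.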


In conclusion, Basili and Weiss offer a series of recommendations for data collectors, 
lessons learned and advice for avoiding data collection pitfalls (Table II). 

Since its introduction, GQM has been used quite extensively for software quality 
and measurement and has evolved to the following template (Park et al., 1996; 
Mendonça and Basili, 2000): 


Analyse “object of study” in order to “purpose” with respect to “focus” from “point of view”. 
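
As a rough illustration of how the template is used, the sketch below fills its four quoted slots to produce a goal statement. The function name and the example values are hypothetical; only the wording of the template itself comes from the text.

```python
def gqm_goal(object_of_study, purpose, focus, point_of_view):
    """Fill the four quoted slots of the GQM goal template."""
    slots = {
        "object of study": object_of_study,
        "purpose": purpose,
        "focus": focus,
        "point of view": point_of_view,
    }
    # A goal with a blank slot is not a complete measurement goal.
    empty = [name for name, value in slots.items() if not value.strip()]
    if empty:
        raise ValueError(f"empty template slots: {empty}")
    return (
        f"Analyse {object_of_study} in order to {purpose} "
        f"with respect to {focus} from {point_of_view}."
    )


# Hypothetical example values.
statement = gqm_goal(
    object_of_study="the enquiry-handling process",
    purpose="understand it",
    focus="response time",
    point_of_view="the point of view of the service manager",
)
print(statement)
```

Rejecting empty slots is a design choice made here for the example: it forces every goal to state its object, purpose, focus and viewpoint before any questions are derived from it.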
Table II. Lessons learned and avoiding pitfalls 

Procedural lessons learned | Non-procedural lessons learned | Avoid data collection pitfalls by 

1. Clearly understand working environment and specify data collection procedures | Understand environmental factors that may influence or affect data | Select data collectors that are familiar with environment or context 

2. Staff should be familiar with circumstances and collection procedures | Do not underestimate resources required to validate and analyse data | Establish data collection goals and methodology prior to beginning analysis 

3. Timely data validation is vital | Data may be sensitive — results should not be used against staff involved in collection | For initial efforts, keep data collection goals small 

4. Minimise (data collection) overhead on staff | Be mindful of the Hawthorne effect — i.e. monitored behaviour may change | Design data collection instrument so that it is independent of a particular project — can be reused and will be understood in later projects 

5. If automated data collection is used, validate data immediately | Contractors (or customers) may feel that data are proprietary. Rules for collection need to be agreed in advance | Integrate data collection into project tasks. Automate as much as possible 

Source: Basili and Weiss (1984, pp. 735-6) 


GQM now incorporates the following six steps (Briand et al., 1997, p. 3): 


(1) Characterise the environment. Identify the characteristics of the organisation 
and project or projects to be measured. 


(2) Identify measurement goals and develop measurement plans. Define 
measurement goals based on the information in step 1. 


(3) Define data collection procedures. Define data collection procedures for all 
measures defined in step 2. 


(4) Collect, analyse and interpret data. 


(5) Perform post-mortem analysis and interpret data. Compare data collected in 
step 4 with organisational baseline. 


(6) Package experience. Structure results into reusable form to be used in the future. 


Goals, questions, indicators and measures (GQIM)[1] 

Another powerful evaluation method, an off-shoot of GQM, is Park et al.’s GQIM 
methodology (Figure 1). Developed by researchers at the Software Engineering 
Institute, the GQIM method provides a powerful way for software evaluators to ensure 
that the software measurement achieves pre-determined business objectives. This 
method starts by asking, “what is it that I want to know?” not by asking “what 
measures should I use?” The GQIM process has ten steps (Park et al., 1996, p. 23; see 
Figure 1): 

(1) Identify business requirements. 

(2) Identify what you want to know or learn. 

(3) Identify sub-goals. 

(4) Identify entities and attributes related to sub-goals. 

(5) Formalise measurement goals. 

(6) Identify quantifiable questions and the related indicators that will be 
used to help achieve measurement goals. 

(7) Identify data elements that will be collected to construct indicators. 

(8) Define the measures to be used, and make these definitions operational. 

(9) Identify the actions that you will take to implement the measures. 

(10) Prepare a plan for implementing the measures. 

GQIM, like GQM, was designed with software measurement goals in mind, but is far 
more comprehensive. Park et al. (1996, p. 25) point out that the method can be used with 
any organisational goal, but caution that several iterations may be needed at steps 2-4 
to maintain traceability to overall goals. Encompassing much of the construct of 
traditional GQM, the ‘I’ or indicator step is raised in profile to warrant inclusion in the 
methodology title. Also, the first four steps (outside the scope of traditional GQM) are 
used to frame organisational objectives that can be used as the basis for the 
measurement goals. 

The process begins with the identification of business goals (step 1). Although it is 
possible to start with lower level goals, in doing so the project may lose the support of 
senior managers (who may consider the project too operational to warrant their 
attention). One suggestion is to start with the goals of the most senior stakeholder, be it 
the project champion, project sponsor or, if necessary, the project manager. To generate 
business goals, the researchers recommend structured brainstorming or the Nominal 
Group Technique (1996, p. 26). Before proceeding to the next step, cross-over goals are 
combined and the list is prioritised. 

Step 2, “Identifying what you want to know or learn”, begins to map a path from 
high-level goals to operational measures. It begins by asking what quantitative 
information is desired. Starting with one of the goals outlined in step 1, the 
stakeholders are identified (groups or people whose concerns are being addressed) and 
mental models are created. This is similar to the step in GQM where the point of view is 
specified. Next, entities (things to be measured and influenced) are identified. Park 
et al. (1996, p. 29) identify four types of process entities: 


(1) inputs and resources; 

(2) products and by-products; 

(3) internal artefacts (for example, inventory and work in process); and 
(4) activities and flowpaths. 


For each entity, questions are asked that seek to elicit information that would be useful 
in managing the goals identified in step 1. Questions generally include descriptors 
such as: How big? How much? How many? How fast? How long? Cost? etc. With that, 
additional questions are asked about the processes as a whole to identify additional 
entities or anything that was missed. These questions revolve around benchmarks, 
customer/stakeholder perceptions, constraints, etc. This cycle is repeated for each goal 
that was identified in step 1. 

Figure 1. The GQIM model 

Step 3 is the link-step that connects the high-level business goals to specific 
measurement goals. Questions (identified in step 2) are grouped into related topical 
areas according to the issues that they address. With these grouped sub-goals, the next 
step (4) is to refine entities and attributes. The attributes, or characteristics of entities, 
are the things that, if quantified, will help to answer questions (Park et al., 1996, p. 40). A 
somewhat pedantic, but important, point is the difference between an attribute 
(characteristic of an entity) and a measure (scales and rules that assign values to 
attributes). Park et al. warn against spending too much time and energy making 
distinctions at this point. 

This process in step 4 may also lead to refining the sub-goals and related questions. 
The first four steps (1-4) have been added to the GQM paradigm to “get to the point 
where the GQM paradigm of Basili and Rombach can be applied effectively” (Park 
et al., 1996, p. 43; Basili et al., n.d.; Basili and Rombach, 1988). Step 5, formalising the 
measurement goals, encompasses the GQM paradigm (outlined above), which comprises 
four elements: 


(1) Object of interest. 

(2) Purpose. 

(3) Perspective. 

(4) Context (or, in Park et al.’s terms, description of environment and constraints). 


The object of interest is the “thing” of study that needs to be better understood, 
evaluated or improved. Examples include: product, process, activity, and metric. Park 
et al. (1996, p. 46) list purpose as understand, predict, plan, compare, assess or 
improve, whereas Briand (1997, p. 21) defines six types of models: characterisation, 
monitoring, evaluation, prediction, control and change. The purpose should be clearly 
defined without any ambiguity. Perspective denotes the point of view from which the 
measurement activity takes place. As team members will undoubtedly see things 
differently according to their position, it is important to construct and define measures 
from the point of view of the user. To avoid out-of-context use of the results, it is 
important to define the constraints that may impact the measurement results. This is 
defined in the GQM model as environment. 

Now it is time to formalise the above sub-goals, entities, attributes and questions into 
measurement goals. The tasks associated with this step are to (Park et al., 1996, p. 51): 


* after reviewing the above, identify information needed; 

* identify activities needed to acquire that information; 

* “Express measurement goals as structured statements that identify the objective, 
purpose, perspective, environment and constraints associated with the 
measurement activity”; and 

* “Identify and record the business sub-goal that each measurement goal 
addresses.” 










Now that measurement goals are defined, it is a good idea to test traceability back to 
sub-goals, goals and business objectives (Figure 2). This exercise will not only ensure 
that all goals (and objectives) are measured, but also that there are no extraneous 
measurements (not linked to a specific goal). 
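
The traceability test described above amounts to two checks: every business goal is reached by at least one measurement goal, and every measurement goal leads back to a business goal. The sketch below assumes, purely for illustration, that the links are held in two dictionaries; the function name and all example goals are hypothetical.

```python
def check_traceability(business_goals, sub_goals, measurement_goals):
    """
    business_goals:    list of business goal names
    sub_goals:         dict mapping sub-goal -> business goal it supports
    measurement_goals: dict mapping measurement goal -> sub-goal it addresses
    Returns (unmeasured business goals, extraneous measurement goals).
    """
    known = set(business_goals)
    covered = set()
    extraneous = []
    for m_goal, sub_goal in measurement_goals.items():
        parent = sub_goals.get(sub_goal)
        if parent in known:
            covered.add(parent)
        else:
            extraneous.append(m_goal)   # no path back to a business goal
    unmeasured = sorted(known - covered)
    return unmeasured, sorted(extraneous)


# Hypothetical example data.
business_goals = ["Reduce time to answer enquiries", "Improve record accuracy"]
sub_goals = {"Shorten retrieval time": "Reduce time to answer enquiries"}
measurement_goals = {
    "Measure retrieval time per enquiry": "Shorten retrieval time",
    "Measure page views": "Increase web traffic",
}

unmeasured, extraneous = check_traceability(business_goals, sub_goals, measurement_goals)
print("Unmeasured business goals:", unmeasured)
print("Extraneous measurement goals:", extraneous)
```

In this example one business goal has no measurement attached and one measurement goal has no business justification, which are the two failure cases the traceability exercise is meant to expose.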

As discussed above, steps 1 through to 4 are necessary to frame the measurement 
goals. With the first steps of GQM completed (measurement goals), quantifiable 
questions should be identified and indicators can now be constructed. It is also at this 
point that the “indicator” step (the ‘I’ in GQIM) is added. Indicators include tables, 
graphs or other graphical representations of data. Park et al. strongly recommend that 
validation processes take place before distribution, as poorly constructed indicators 
and questions can be misleading to the audience (Park et al., 1996, p. 59)[2]. One specific 
recommendation is to envisage unexpected results in the context of the proposed 
indicators. By evaluating how these will be received or interpreted, questions and 
indicators can be refined in a meaningful way. The process for identifying quantifiable 
questions and indicators is as follows: 


* select one of the measurement goals; 

* identify questions that relate to this goal; 

* prepare indicators that will address questions and communicate results; 
* prioritise indicators in order of importance; and 

* repeat for other measurement goals. 


Step 7 involves identifying the actual data elements that have to be collected to 
construct indicators. The important thing to remember — particularly in an 
information-seeking context — is that the data that are to be collected at this point map 
directly back to measurement goals, which should in turn map back to actual business 
goals (Figure 3). Data elements can serve multiple indicator needs, but no data are 
collected for collection’s sake. With data elements, measures are identified. 

Figure 2. Maintaining traceability (objectives and goals) 

Figure 3. Maintained traceability (GQIM) 

Once the data elements are identified, measures are defined (Step 8). This means a 
detailed description of how the measure is constructed (including formulas and/or 
SQL) and how the data are obtained. Measures must satisfy two criteria (Park et al., 
1996, p. 67). They must be: 


(1) clearly communicated, letting others know exactly what was measured and 
how; and 


(2) repeatable — a neutral party, with the operational definition, should be able 
to reconstruct the measure. 
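
One way to picture an operational definition that meets both criteria is a record that states what is measured, where the data come from and how the value is calculated. This is a sketch under assumed names, not the checklist format Park et al. propose: the class, its fields and the example measure are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class MeasureDefinition:
    name: str
    description: str                   # what is measured, in plain words
    data_source: str                   # how the data are obtained
    data_elements: Tuple[str, ...]     # inputs the formula needs
    formula: Callable[[Dict[str, float]], float]

    def compute(self, record: Dict[str, float]) -> float:
        """Apply the documented formula, refusing incomplete input."""
        missing = [e for e in self.data_elements if e not in record]
        if missing:
            raise KeyError(f"missing data elements: {missing}")
        return self.formula(record)


# Hypothetical measure: share of enquiries resolved.
resolution_rate = MeasureDefinition(
    name="enquiry resolution rate",
    description="Enquiries resolved divided by enquiries received in the period",
    data_source="monthly extract from the enquiry log",
    data_elements=("enquiries_received", "enquiries_resolved"),
    formula=lambda r: r["enquiries_resolved"] / r["enquiries_received"],
)

record = {"enquiries_received": 40, "enquiries_resolved": 30}
print(resolution_rate.compute(record))
```

Because the description, source and formula travel together, a neutral party holding the same record and the same definition should arrive at the same value, which is the repeatability criterion in practice.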





For GQIM to be repeatable and useful beyond a single project, operational definition 
checklists[3] and documentation forms should be created for the organisational and 
domain-specific problem set (i.e. information integration). These checklists should not 
point out what the user should do, but rather give guidance on how the data can be 
interpreted correctly (Park et al., 1996, p. 84). 

Steps for defining measures include: 

* Choose an indicator for definition. 

* If a suitable framework (checklists and forms) exists, use it to create definitions. 
If not, checklists and forms need to be created and special care must be taken to 
define the measure so that it can be communicated and is repeatable. 

* Repeat until all rules are defined for all data elements. 


Now is the time to translate measures into an operational plan. Step 9 encompasses the 
analysis of the current measurement (information retrieval) situation within an 
organisation as a baseline to launch the action plan. This step involves three activities: 
analysis, diagnosis and action. 










Analysis is the determination of the current baseline and diagnosis is the evaluation 
of the data elements that the organisation is currently using in the context of the new 
measurement plan. Questions that could be used in analysis and diagnosis include 
(Park et al., 1996, pp. 88-9): 

* What data elements are required for my goal-driven measures? 

* Which data elements are collected now? 

* How are they collected? 

* What are the processes that provide the data? 

* How are the data elements stored and reported? 

* What existing measures and processes can be used to satisfy our data
requirements?

* What elements of our measurement definitions or practices must be changed or
modified?

* What new or additional processes are needed? 


The action sub-step is the distillation of the results of the analysis and diagnosis into 
an implementable action plan, including task definition, resource allocation and 
assignment of responsibilities. This could include (Park et al., 1996, p. 90):


* identification of data sources; 

* defining data collection methods and reporting;

* specifying data collection and storage tools; 

* defining frequency of data collection and milestones; 

* documentation of data collection procedures; 

* defining who will use the data; 

* defining how the data will be reported and analysed; and 
* packaging into a data definition and process guide. 


With the information collected in the preceding nine steps, a complete and traceable
path is created that links data elements back to the over-arching business (or 
information-seeking) objectives of an organisation (Figure 3). The last step (10) in the 
GQIM process is the preparation of a plan[4]. 
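
The traceable path described above can be sketched as a chain of "derived from" links. This is an illustrative sketch only: the item names and the `DERIVED_FROM` mapping are hypothetical, and the only assumption is that each data element, measure, indicator and question records the single item it was derived from.

```python
# Hypothetical links: each item points to the item it was derived from.
DERIVED_FROM = {
    "data:query_log.source_id": "measure:sources_per_request",
    "measure:sources_per_request": "indicator:retrieval_coverage",
    "indicator:retrieval_coverage": "question:are_all_sources_searched",
    "question:are_all_sources_searched": "goal:improve_information_integration",
}


def trace_to_goal(item, links):
    """Follow the links upward and return the full path, starting at `item`."""
    path = [item]
    seen = {item}
    while path[-1] in links:
        parent = links[path[-1]]
        if parent in seen:  # guard against accidental cycles in the links
            raise ValueError(f"cycle detected at {parent}")
        path.append(parent)
        seen.add(parent)
    return path


path = trace_to_goal("data:query_log.source_id", DERIVED_FROM)
```

The last entry of `path` is the objective that justifies collecting the data element; an element whose path never reaches a goal would be a candidate for review.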

The next section overviews several industry experiences, mostly in software 
engineering environments, with implementing GQM and goal-driven measurement. 


Industry experiences with goal-driven measurement 

There are several examples that stand out in the literature where GQM was used with 
success. As winner of the first IEEE Computer Society Software Process Achievement 
Award, the ground-breaking work at NASA’s Software Engineering Laboratory (SEL) 
incorporates some of the key aspects of the GQM paradigm in its process improvement
process (McGarry et al., 1994). Again, focused on software measurement, this so-called 
“bottom-up”[5] approach relies on incorporating past experiences into an ongoing and 
iterative measurement programme. The three steps in the SEL approach are (McGarry
et al., 1994, p. 2):

(1) Understanding.
(2) Assessing. 
(3) Packaging. 


First, a thorough understanding of the current environment is gained. Next, goals are 
used to determine improvements that need to be made (assess) and lastly, process 
changes are implemented (package). Thus the cycle begins again and iteratively 
continues. Although the SEL paradigm is focused on delivering software process 
improvements — in six years, the error rate of completed software dropped 75 per cent 
(McGarry et al., 1994, p. vii) — the methodology provides an interesting framework for 
modelling information flow. 

Another study (Mendonça et al., 1998) shows how the approach was used at IBM
Software Solutions Division Toronto Laboratory to analyse customer satisfaction data. 
This study compares GQM, a top-down measurement approach, with the AF (Attribute 
Focussing) knowledge discovery (bottom-up) technique. In this situation, GQM 
provided a measurement context and an ongoing framework to run a measurement 
programme. The AF technique gave researchers a tool to analyse legacy data. Many 
measurement frameworks are prone to: “(1) collecting redundant data, (2) collecting 
data that nobody uses or (3) collecting data that might be useful to people that do not 
even know the data exist within the organisation” (Mendonça et al., 1998, p. 484). It is
for these reasons that they stress the importance of ongoing measurement and the use 
of a traceable methodology such as GQM. 

At IBM, a bi-directional approach set out to: “understand the ongoing measurement, 
structure of the measurement and explore the legacy data” (Mendonça et al., 1998, p.
487). The first (“top-down”) phase incorporated GQM to capture user goals and map 
them to the underpinning data. However, a weakness of the top-down approach is that 
it can ignore or overlook certain valuable data that is already collected within the 
organisation. For exploratory data discovery, bottom-up approaches are necessary. 
The second (bottom-up) phase uses AF to discover new and interesting facts. This 
combination provides a holistic view of the measurement needs of the organisation. 

This study shows that GQM can adapt to be used in organisations with existing 
measurement frameworks and is valuable in identifying extraneous or no longer useful 
metrics (Mendonça et al., 1998, p. 489). Since GQM maps end-user goals to metrics, if
metrics exist that do not map to goals, then the importance of gathering that 
information must be examined. 
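
The check described above, finding metrics that do not map to any goal, can be sketched in a few lines. The metric and goal names are invented for illustration and do not come from the IBM study; the sketch assumes only that the goal-to-metric mapping produced by GQM is available as a dictionary.

```python
def unmapped_metrics(collected, goal_to_metrics):
    """Return collected metrics that no goal refers to, in sorted order."""
    mapped = set()
    for metrics in goal_to_metrics.values():
        mapped.update(metrics)
    return sorted(set(collected) - mapped)


# Hypothetical inventory of metrics already being gathered.
collected = ["response_time", "satisfaction_score", "fax_requests", "page_views"]

# Hypothetical result of a GQM exercise: each goal and the metrics it needs.
goal_to_metrics = {
    "improve customer satisfaction": ["satisfaction_score", "response_time"],
    "understand site usage": ["page_views"],
}

orphans = unmapped_metrics(collected, goal_to_metrics)
```

Any metric returned in `orphans` is not necessarily useless, but its collection cost is not currently justified by a stated goal.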

Recently, in another study Boyd (2002) presented an illustration of the adaptability 
of GQIM. As a model for customer satisfaction measurement with e-commerce web 
sites, this adaptation was outside the context of software measurement, although still 
measurement focused. Other examples of the adaptability of GQM are put forth by Pai
(2002) in the context of Software Quality Function Deployment (SQFD) and Kilpi (2001)
at Nokia. As GQM was originally developed as a software measurement framework,
its use in requirements engineering seems to make sense. SQFD is a five-step process
used for eliciting and defining customer requirements. When used with GQM
(Table III), the combined process quickly identifies extraneous requirements, leading to
enhanced usability (Pai, 2002, p. 23). In practice, this combined approach resulted in a
15.2 per cent reduction in system size at the CS Foundation (Pai, 2002, p. 23).














SQFD process | SQFD with GQM
Customer requirements are solicited and recorded | Record customer requirements in report form
Requirements are converted to measurable technical specifications | Identify goals of the project for user, developer and manager perspective
Requirements are mapped to product specifications (with customer feedback) to create a correlation matrix | Ask questions derived from goals and measure against requirements reports
Requirements are prioritised by customer | Modify and reconfirm the improper requirements, then complete matrix
Priorities are determined by multiplying customer priorities with matrix | Priorities are determined by multiplying customer priorities with matrix

Source: Pai (2002, pp. 21-2)
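
The final step in the table, multiplying customer priorities with the correlation matrix, can be illustrated with a small worked sketch. The numbers, the 9/3/1/0 correlation scale and the weighted-sum reading of "multiplying" are assumptions made for illustration; Pai's exact scoring scheme is not reproduced here.

```python
def technical_priorities(customer_priorities, correlation):
    """Weight each specification by the requirements it correlates with.

    customer_priorities[i] is the customer's priority for requirement i.
    correlation[i][j] is the strength of the link between requirement i
    and technical specification j.
    """
    n_specs = len(correlation[0])
    return [
        sum(p * row[j] for p, row in zip(customer_priorities, correlation))
        for j in range(n_specs)
    ]


# Hypothetical data: three requirements, three technical specifications.
customer_priorities = [5, 3, 1]
correlation = [
    [9, 1, 0],  # requirement 0
    [3, 9, 0],  # requirement 1
    [0, 3, 9],  # requirement 2
]

scores = technical_priorities(customer_priorities, correlation)
# scores == [54, 35, 9]
```

Under these assumed figures the first specification carries most of the customer value, while the third depends only on the lowest-priority requirement and would be an early candidate when looking for extraneous requirements.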


Through the adaptation of GQM, much of the overhead normally associated with the 
GQM methodology was reduced at Nokia (Kilpi, 2001, p. 72). The basic differences in 
the Nokia method include: 


* uses predefined metrics from a metrics library; 
* automates data collection; and 
* does not utilise a full-time measurement team.


Kilpi goes on to argue that management has the responsibility to set the process 
improvement strategy including goals, and that most goals are common across 
projects anyway. Nokia also automates data collection as part of the project procedure. 
Therefore, there is a cost saving in data collection as the laborious goal-setting process 
is avoided and there is no manual data collection requirement. The overheads 
associated with a sample GQM-based measurement programme (vs the Nokia way) are 
(Kilpi, 2001, pp. 72, 76): 
* Defining the measurement programme accounts for roughly 30 per cent of effort,
whereas continuing the measurement programme requires 70 per cent of the
effort.

* An 11 person-year project requires three months of effort to administer.


* Using the example above, the total person-hours required to administer 
traditional GQM is greater than 500, whereas the Nokia way would require less 
than half that. 
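
The figures above can be put side by side with a short calculation. This is a rough sketch under stated assumptions: "three months of effort" is read as three person-months, and 500 person-hours is used as an illustrative baseline because the source only gives it as a lower bound.

```python
def effort_split(total_hours, definition_pct=30):
    """Split administration effort into (definition, continuation) hours."""
    definition = total_hours * definition_pct / 100
    return definition, total_hours - definition


def administration_share(admin_months, project_person_years):
    """Fraction of total project effort spent administering measurement."""
    return admin_months / (project_person_years * 12)


# Illustrative baseline only; the source says "greater than 500" person-hours.
definition_hours, continuation_hours = effort_split(500)

# Assumption: three months of effort means three person-months.
share = administration_share(admin_months=3, project_person_years=11)

# "Less than half" of the baseline gives an upper bound for the Nokia approach.
nokia_upper_bound = 500 / 2
```

On these assumptions the administration overhead is a little over 2 per cent of total project effort, with most of it falling in the continuation phase rather than in the initial definition.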


Rifkin and Cox (1991) studied 11 divisions of eight organisations — including Contel, 
Hewlett Packard, Hughes Aircraft, IBM, McDonnell Douglas, NASA, NCR and TRW — 
with reputations for excellence in measurement. Although not explicitly restricted to 
goal-driven approaches, the primary lessons learned during this study revealed best 
practice areas. First, they found that organisations that embraced the object of 
measurement (in a software quality context — “errors”) reduced the stigma of negative 
associations. Thus, employees knew that the delivery of bad news would not be 
punished and organisation-wide discussion became easier. 

In best practice organisations, measurement is not conducted in a vacuum. The
measurement programme was conducted as part of a culture of quality, not within the
micro-context of software process improvement. This ensured that all levels of the
organisation were bought into the programme, not just management or just
engineers. With across-the-board buy-in (and reward structures put in place to
encourage participation), people were motivated to participate and expectations were
managed across all stakeholder levels. Key to the success of the programmes was
getting “the right information to the right people”.

Table III. SQFD and GQM

Other cross-organisational patterns that emerged revolved around the metrics 
themselves. Successful organisations generally started small (with one measure)
and broadened the programme on the back of success. However, to reduce ambiguity
and the level of compliance effort, these “micro-metrics” were rigorously defined and
were gathered using automated tools. Programmes often took an evolutionary,
iterative approach, but it was recognised that first efforts might be “throw-away” (as
experimental and ever-changing). Some organisational metrics survived scrutiny,
others did not. Regardless, an “early win” is deemed necessary for ongoing success and
survival. For ongoing success, the measurement programme must add value to
development efforts and line-personnel must be empowered to act upon the 
information. Despite delivering early wins and an iterative approach, successful 
organisations recognise that measurement programmes sometimes require cultural 
shifts and changes in attitude. Even when all other success criteria are put in place, this 
does not happen quickly. 

A later study that focused on goal-driven measurement experiences (Goethert and 
Hayes, 2001) included a series of case studies where GQM had been deployed from 
three perspectives: 

(1) in a global software firm;

(2) studying the impact of software process improvement; and 


(3) with enterprise performance management from a “local perspective”. 


The lessons learned generally correspond with the Rifkin and Cox study, including the 
necessity to pilot implementations (start small and build), understand that 
development of a measurement programme takes time, use automated tools, define 
measures and metrics, and motivate the right behaviour. A summary of multi-case 
experiences is outlined below (Table IV). 


Table IV. Summary of multi-case experiences

Goethert and Hayes (2001, p. 25):
* Maintain traceability
* Define type and purpose of each indicator
* Start small and build on success
* Develop comprehensive list of indicators to detect trends and hidden tradeoffs
* Customise the indicator checklists for the organisations
* Use checklists to define measures
* Use specialised tools to disseminate information
* Pay close attention to privacy issues
* Plan to address cultural issues
* When there is no consensus on how to proceed, base decisions on cost
* Use pilot implementations
* Recognise time required to develop measurement programme
* Make the tool fit the process
* Do not be afraid to revise initial assumptions
* Beware of the different perspectives of various stakeholders

Rifkin and Cox (1991):
* Decriminalise the object of measure. Make it ok to discuss potentially bad news
* Make measurement part of larger programme: create a culture of measurement
* Start small (with one measure)
* Rigorously define measures
* Automate collection and reporting
* Motivate staff to become involved. Put rewards structure in place to encourage measurement efforts
* Set expectations through articulated goals in a focussed manner (i.e. cost, schedule, quality)
* Involve all stakeholders in goal setting
* Earn trust of participants by not punishing bad news
* Take an evolutionary approach to programme development
* Plan to “throw” the first effort away. Use a pilot study
* Get the right information to the right people
* Strive for early success, deliver early win
* Make sure that the effort “adds-value” (i.e. something is delivered from the effort)
* Empower employees to use information
* Take a whole process point of view: measurement is only one piece of a greater whole
* Understand that measurement and adoption takes time


Conclusion: strengths and weaknesses of goal-based methods
Clearly defined and widely accepted metrics and models are crucial for measurement
success. A goal-oriented approach is helpful in three ways:

(1) it “ensure[s] the adequacy, consistency and completeness of the measurement
plan, and therefore, of data collection”;

(2) it “manage[s] the complexity of the measurement programme”; and

(3) “stimulate[s] a structured discussion and promotes consensus about
measurement and improvement goals” (Briand et al., 1997, p. 2).

Specifically, this research has found GQM to be:

* Rigorous. As seen above, the literature puts forth several examples of the
successful use of GQM and goal-driven derivatives. The real-world use of GQM,
and subsequent publication of case studies, spans nearly 20 years, indicating that
the methodology is truly useful and not just a passing “fad”.


* Adaptable. As illustrated in the industry examples, as well as GQIM, the GQM 
methodology in practice today differs significantly from the original idea put forth 
by Basili and Weiss (1984). As a framework, GQM has proven itself to be adaptable 
to different organisations and the changing environments of software measurement. 
The very fact that it is adopted by and used in commercial organisations indicates
that there is inherent value in the methodology. In the Darwinian world of software 
development, rarely do things that do not provide value survive. There seems little 
reason why GQM could not travel outside the context of that original use to be 
applied in information integration and retrieval scenarios. 

* Flexible. Not only is the framework adaptable, it is also flexible.
As seen in the Nokia, IBM and CS Foundation cases, GQM works well with
additional methodologies and can be adapted for a particular organisation. GQM 
can also be restricted to a subset of goals and grow with success. 

Although the literature conspicuously lacks meaningful criticism, GQM is not
without limitations. One of the major weaknesses is the propensity for the number of
metrics to grow to an unmanageable amount (Expansive). The production of questions
is situationally and even organisationally dependent. These two factors lead to questions
of repeatability and limiting scope (“non-terminating”). Card (1993, p. 94) argues that
GQM can very quickly grow beyond its usefulness; one study he references consisted
of four goals that grew to over 100 questions. And since multiple teams produce
different questions, the results are likely not repeatable. Lastly, he points out that
questions that arise from the GQM exercise may not be answerable unless
organisational changes are made. Given these limitations, Card feels that GQM should
be used as a supplemental methodology, although in the conclusion of his editorial (1993,
p. 95) he concedes that GQM is better than what was previously available, which was
“largely nothing”.

Another weakness, briefly touched upon by Kilpi, is the overhead associated with
managing GQM (Expensive): from dedicated implementation teams to time-consuming
goal-setting sessions and negotiations, it could be costly. This could be
particularly limiting in low maturity environments.

Additionally, as outlined by Mendonça et al., a “top-down” measurement approach
alone does not allow for discovery and can often ignore or overlook legacy data (Focus).
McKeehan et al. (1998, p. 5) take a more vitriolic tone when discussing the weaknesses
of GQM by surmising that “although this approach [GQM] is better than none at all, it 
is beset with problems”. The researchers assert that GQM “fails to recognise that 
managers don’t always know what their goals should be”. They go on to suggest that: 


Top-down methods lack support and enthusiasm from practitioners. It encourages 
“data-manipulation”. With a set goals [sic], the data collection or processing procedures 
tend to produce results that show improvement because that people developing the measures 
are focused on the goal and what the numbers are expected to show (McKeehan et al., 1998, p. 
5). 


Although McKeehan et al. raise some interesting points, their references are unclear 
and thus the majority of their arguments remain unsupported by the literature. 
However, the point about managers failing to properly set goals is also highlighted by 
Wilson et al. in their recognition that business strategies (and consequently goals) may 
not be fully articulated: 


IT Staff like to ask “the business” (whoever that is) for the “business strategy” (whatever that 
is) — which they expect to be predetermined, formalised and explicit — so they can “support 
it” by “solving business problems” (Wilson et al., 2002, p. 198). 


The double-quotations and parenthesised comments indicate that Wilson et al. do not
necessarily believe that it is always possible for an organisation to be fully aware of, or
able to articulate, its strategies. If this is true, it is probably safe to assume that it is also 
true with organisational goal setting. That is why a structured, goal-elicitation process 
such as GQIM is desirable, as the process forces both managerial and IT staff 
participation. 

To reduce risks associated with the GQM approach, there are several lessons that
can be gleaned from the literature: 


* Limit metrics. As pointed out above, a GQM programme can quickly grow out of
control as new questions are added. However, at Nokia, it was recognised that
many of the project goals and questions could be reused, thus reducing the costs
of managing a GQM measurement programme.


* Start small and grow/pilot implementations. The programme is likely to morph
and change. Therefore, it is a good idea to start with smaller, achievable, goals 
and to pilot projects before wide-scale rollout. 


* Be mindful of human and cultural issues. For the programme to be successful,
people factors must be considered. Several suggestions were put forth 
recommending that resulting information be masked when presented and that 
information not be used to “punish” poor performers. Without the risk of the 
information being used against the participants, cooperation is more likely. 


* Automate data collection. Any additional work or overheads that make 
employees’ jobs more difficult will be resisted. Some (likely higher maturity)
organisations will have the resources to assign dedicated personnel to a project. 
However, in low maturity organisations, as much of the information gathering as 
possible should be automated. 


Clearly, GQM/GQIM may not be the right tool for all information retrieval situations. 
However, given its history of success, rigour, flexibility and adaptability, it is likely that
it will be useful in modelling many multi-source information retrieval scenarios. The
next article in the series will show how goal-based methods have been adapted for 
information modelling. 


Notes 


1. Park et al. denote GQIM as “GQ(I)M”. The parentheses have been eliminated in this paper in
the interest of readability. 


2. At this point, Park et al. digress into a slightly pedantic discussion of the use of the word 
“metric” vis-à-vis “measure”. In their minds, GQM stands for “goal-question-measure”, not
“goal-question-metric” as put forth by the earlier literature. But they feel that discussion of 
terminology is important in determining what is to be measured; over time a carefully 
crafted question may be far more useful than “exact percentages” (metrics). 


3. Park et al. (1996, pp. 66-82) present a series of operational definitions that can be used in a
software quality context. However, since it is assumed that this research will be applied in
other contexts, the section on defining terms (1996, p. 84) is not highly relevant.


4. A template is provided by Park et al. (1996, pp. 95-8).


5. A note about terminology: here “bottom-up” refers to the incorporation of goals derived on 
the local level, as opposed to a top-down approach whereby goals are part of the universal 
goal framework (e.g. the Capability Maturity Model). Other references in the literature such 
as Mendonça refer to “bottom-up” as a data-centric approach and “top-down” as an 
objective-based approach. 


Abstract

Purpose — A research study into the use of information and communication technologies (ICTs) in a 
special educational needs (SEN) environment, as part of a larger project to develop a multimedia 
learning environment for this group. Benefits and barriers of ICT usage in this environment were 
examined, and attitudes and experiences of SEN teachers were explored. An enquiry into the 
information and other needs of the teachers formed part of the study, and the working environment 
was also researched, for contextual information. 

Design/methodology/approach — Qualitative depth interviews were undertaken in the working 
locations of the SEN teachers and assistants. 

Findings — The SEN working environment was found to have changed greatly in recent years. There 
was now a more formal and structured curriculum, and many attempts at activities designed to foster 
inclusion. Difficulties faced by teachers included a lack of equipment, poorly functioning equipment, a paucity
of appropriate learning materials, and unusual challenges posed by the differing needs of learners. The
needs of teachers included ways of facilitating evidence of progress, lesson plans classified according
to cognitive and accessibility levels, and administrative information. Advantages of using ICT ranged 
from enhancing the learning experience by offering a more personalised environment, to “liberating 
pupils” from problems such as physical cutting and pasting. 





Originality/value — Most literature on using ICT for those with SEN focuses on physical rather than
cognitive disabilities. There has been almost no literature on the views or needs of SEN staff, with
regard to this topic.

Keywords Communication technologies, Disabled people

Paper type Research paper


Project @pple (Accessibility and Participation in the World Wide Web for People with Learning
Disabilities) was a cross-disciplinary initiative, running from 2004 to 2006, led by Andy Minnion,
Director of The Rix Centre for Innovation and Learning Disability at the University of East
London. Project @pple brought together multimedia producers from small and corporate
business partners (Xtensis and Macromedia), the UK’s leading learning disability charity,
Mencap, and researchers examining the following themes: advocacy (Karen Bunning and
Rebecca Heath, Speech and Language Therapy, University of East Anglia); cognition (Rakesh
Odedra, Psychology, University of East London); knowledge (Helen Kennedy, New Media
Studies, University of East London); learning (Mary Newman, Innovation Studies, University of
East London); and usability (Peter Williams and David Nicholas, Information Science, University
College London). The project was funded by the ESRC, EPSRC and DTI’s PACCIT programme
(“people at the centre of communication and information technology”). The author is grateful to
all partners for their contributions to project @pple, which have played a role in the production
of this paper.

Aslib Proceedings: New Information Perspectives
Emerald Group Publishing Limited
0001-253X
DOI 10.1108/00012530510634262


Introduction 

Project @PPLe: Accessibility and Participation in the World Wide Web for People with 
Learning Disabilities is a research project funded by the Economic and Social Research 
Council’s (ESRC) PACCIT programme (“People at the Centre of Communications and 
Information Technology”). Project @PPLe aims to examine how information and
communication technologies (ICTs) can be exploited to enhance learning, 
communication and, hence, self-advocacy for those with learning disabilities. 
Specifically, the project will produce a multimedia learning environment (LE), which 
will be designed to facilitate individual learning programmes, and offer a teacher section 
to include material and facilities required and requested by staff. The project involves 
four types of partner: academic researchers from the communication science, 
informatics and cultural/technology studies fields; commercial multimedia producer 
partners, including Macromedia; and the voluntary sector, represented by the disability 
charity Mencap. 

Effective use of any ICT application depends on the context and environment within 
which it is to be used, including the constraints and barriers, and the attitudes and 
aspirations of the potential users. It also depends, of course, on the extent to which any 
system introduced actually meets the needs of those for whom it is provided. Thus, an 
important and early part of the project was to both examine the physical and social 
environment within which ICTs may be used, and also to develop an understanding of 
the experiences and attitudes of those who work with learning-disabled students, and 
their information and other needs. 

This paper describes a qualitative study of the environment within which the LE 
will be used, exploring the special educational needs (SEN) environment in general, the 
information and other needs of staff, the benefits they see — and maybe have already 
found — in using ICT, and the constraints and barriers they face. Findings from this 
element of the study informed the development of the LE which is at the heart of the 
project, providing learning resources for self-advocacy to match the requirements of 
the widest possible spectrum of users with disabilities. 


Aims 
The aims of this particular study were to explore the working environment of the 
teachers, and to identify information and other needs that might be addressed by the 
development of the LE. The objectives of the study were to examine: 
* the working environment within which teachers of SEN students operated, and
what staff considered the major issues to be;
* the information needs of teachers, with a view to addressing these in the
development of the LE;
* subjects’ experience with ICT, and how ICT has impacted upon the
SEN LE and upon meeting information needs;











* the advantages and benefits of using information technology; 


* the barriers to successful usage of ICT applications and how these can be 
overcome; and 


* what facilities teachers would like to see on the developing LE.


Previous literature
The “user centred” approach on which the current project is predicated has been
championed by Kuhlthau (1991, 1999), Wilson (1981, 1997) and Savolainen (1995, 1999), 
amongst others. Wilson (1981) claims that information need research should focus upon: 


... uncovering the facts of the everyday life of the people being investigated ... to understand 
the needs that exist which press the individual towards information-seeking behaviour [and 
that it is] ... by better understanding those needs we are better able to understand what 
meaning information has in the everyday life of people. 


A small number of studies have been carried out with regard to the information needs 
of teaching and academic staff. For example, Lent et al. (1997) report on a survey,
carried out in 1996, of faculty members at the University of New Hampshire, USA, to
discover which serials they read. A 51 per cent response rate was achieved. Results
showed that use of core lists for collection development decisions was not adequate and
document delivery was not a full solution to information needs. Similarly, Reneker
(1993) examined, in some depth, the information seeking activities of the Stanford
University academic community. Information needs were elicited in relation to 
perceived environment, source use, personal characteristics, and user satisfaction. 
Results revealed information seeking to be a function both of need and availability of 
information. In a somewhat different vein, Banwell and Edwards (1997) discussed 
various British library projects, including school governors’ information needs and 
access to information. 

Only one study to date appears to have examined information needs in a special 
school environment. Westbrook (2001) and colleagues carried out an information needs 
analysis of faculty staff at a private non-profit special school in Texas, serving the 
needs of dyslexic, attention deficit disorder (ADD) and attention deficit hyperactivity 
disorder (ADHD) children, in order to inform and improve the school library service. 
The aim of the study was to “identify and delineate the professional information needs
of faculty with particular reference to their curricular support issues” (Westbrook, 
2001, p. 40). Two main methods were used in the analysis: focus group interviews and 
questionnaires, the latter being constructed according to results elicited by the former. 

The data were analysed “using simple descriptive statistics” (Westbrook, 2001, 
p. 40), and identified various faculty information needs. These included: 


* a greater variety of high-interest books for students with low reading ability; 


* equipment needs — more VCRs, televisions and (presumably electronic) access to 
the library catalogue from the classroom; 


* various electronic resources including web site lesson-plans; 
* news updates related to professional issues; and 
* current contents notification. 


The study findings led towards a set of recommendations useful for strategic planning. 


Method

The main data-gathering instrument used in the present study was in-depth 
interviews, although participant observation was also undertaken in order to 
familiarise the researchers with the working environment of the subjects, and to gain 
insights into some of the difficulties and constraints they were under. Face-to-face 
semi-structured interviews were undertaken with both teachers and LSAs (learning 
support assistants). These were planned to last from 30 to 45 minutes, although many 
were either longer or shorter than this, depending on the availability of the interviewee 
and their disposition to answer in depth. Questions concentrated on the themes 
constituting the objectives of the study outlined above. 

Interview transcripts were “framework” analysed (Ritchie and Spencer, 1994).
This approach involves a systematic process of filtering and sorting material into
themes and key issues, and has been used often in health/medical research (Leydon
et al., 2000), including studies undertaken by the present writers (Nicholas et al.,
2003, 2005). Once such key themes had been established, which included a priori
topics informed by the research aims, and issues raised by the interviewees, the
original notes were then thematically indexed and “charted”. From this, concepts
and associations were elicited, along with the strength and extent of views,
behaviour, etc.


Sample 

Interviews were conducted with teachers and teaching assistants from two locations:
an SEN school, and the special needs unit of a college of further education (FE), both in
the West Midlands. In the FE college, the interviewees were:

* the head of the special needs unit;
* the co-ordinator for catering services;
* three teaching assistants, including one also working as an administrator;
* the curriculum leader for learning disabilities; and
* seven supported learning lecturers, including one in the performing arts (i.e. not
the supported learning) department.

In the school, the interviewees were the head of ICT and the deputy head of 14-19 year olds.


Constraints 

Teaching is a demanding profession, and arguably teaching in a special needs unit 
incurs additional pressures. Many constraints were imposed upon the researchers due 
to the pressures under which the staff worked. For various reasons — usually staff 
shortages — subjects were forced to cancel or re-schedule meetings. On one particularly 
difficult day, a four-hour trip to one location yielded only one interview, as two staff 
were absent through illness, and a room had been double booked. There was also a 
serious problem involving the parent of one of the students, and, in the confusion, 
another pupil, who had been told not to undertake a particular class, did so, as the
substitute teacher was unaware of the ban. In addition, it was possible that staff
members did not wish to be interviewed owing to anxieties about their teaching practices, or
about not using technology sufficiently or in a manner of which the researchers
“approved”. In order to minimise the possibility of this, the researchers emphasised
their impartial stance, confirmed both that interviewees’ views were all equally 
important and valuable, and that it was not only the views of ICT enthusiasts that were 
sought. 





Results 

The working environment 

The working environment of SEN teachers was found to be very high-pressured. As 
mentioned above, staff shortages created many problems, and there were other 
frustrations too, most notably, having to work with out-of-date or poorly performing 
equipment. However, many positives were reported too, including the effects of 
government legislation on the SEN curriculum, and recent moves to facilitate the 
inclusion of these pupils in the main body of student life and learning. Main themes 
that emerged were: 


* rate of change in the sector;
* facilitating inclusion;
* evidencing work; and
* poor functionality of equipment.


Rate of change in the sector 

There was a general feeling that both the practice of teaching and use of ICT, within 
the education sector generally, had undergone massive change over the previous 
decade. Many interviewees had been in FE for at least ten years, two for considerably 
longer. Changes cited related to educational policy and consequent practice, and to the 
availability and use of ICT. With regard to educational policy, the exigencies of various
disability discrimination Acts[1] have led to more integration of learning support into
the mainstream sector of the college, as discussed below. Also, there is much more
formal learning now. According to one respondent: 


... there was no formal things that students could do. ... basically you did what you liked,
there was no structure or anything to it. ... we’re Ofsted inspected now (and) a lot of the
changes have been positive as well and meant better things for the learners — improved 
learning and more focused learning. 


With regard to change as related to ICT, one interviewee said that 15 years ago there 
were no computers at all in the college. Now, nearly every lesson involved their use 
either by the students and/or by their tutors (e.g. in preparation/presentation of 
worksheets, etc.). More is said about the use of ICT below. 


Facilitating inclusion 

The learning support unit (and also the college as a whole) is now obliged to do more to 
facilitate “inclusion”. This was said to have been beneficial — there is more integration 
of SEN into the college generally, and more recognition of the needs of the department. 
Special needs pupils who express an interest in a topic that is taught elsewhere in the 
college are now able to be taught by mainstream lecturers (albeit not with mainstream 
pupils). They are accompanied in these activities by a support tutor. This policy was 
said to be excellent for boosting self-esteem and self-confidence. 

Also with regard to inclusion, the situation regarding special needs pupils who do 
not have a cognitive disability, but who are physically disabled (e.g. visually impaired,
deaf, etc.) is worth mentioning. As far as possible, these students follow a mainstream 
course. However, they need support lessons in language and communication, 
undertaken with a special needs tutor. One aspect of this is grammar, the reason being 
that they do not receive the auditory reinforcement of the rules of grammar, which 
other students obtain all the time, through their everyday interactions. Deaf students
have to learn grammar “by rote”. Unfortunately there are “virtually no suitable 
learning materials” to help teach this. Interestingly, BSL (British sign language) has 
been afforded the status of a language, an indication of the importance of the gulf 
signers face between this and standard English. 

Another issue under the term “inclusion” is that of catering for different ability 
levels and unusual intellectual development. Some concern was expressed regarding 
the wide range of ability levels, and the fact that pupil intellectual development did not 
appear to conform to what might be expected. One interviewee gave the example of 
their development being like a ladder where the rungs are in the wrong order, with 
some missing. It was difficult to tailor work that was meaningful for each of the 
students, and there was a danger of some learners being left idle. Similarly, one 
interviewee asked whether “high level” resources would be available on the LE to those 
students with learning difficulties who have high IQs — the disabilities here being 
more to do with communication. This is a big challenge for the content developers. 
There is a vast amount of pedagogic material available on the internet. Whilst it may 
not be specifically designed for SEN students, it may be possible for the LE to build up 
a resource of links to these sites, or allow the teachers/users to build up this list 
themselves.

Finally, there is an issue of how “inclusion” is defined. It was often understood, 
according to teachers’ accounts, to imply that an “open door” policy was needed with
regard to student enrolment on various courses. In fact, the view of one interviewee 
was that a more realistic interpretation should be that it was aimed to provide a “best 
match between student and learning opportunity”. Thus, a course leading to a care 
certificate might be adaptable for participation by a visually impaired student, but this 
might not represent the most effective use of that student’s talents, as — both with 
regard to the job market and also the practicalities of the profession — it would be, 
unfortunately, very difficult for a visually impaired student to find gainful employment 
in the area, particularly if they were interested in working with children. A better 
match might be an alternative course. This would appear to have implications for the 
database of learning objects and the taxonomies to access them to be included in 
the LE. 


Evidencing work 

One issue that was clearly extremely important for staff was that of providing 
evidence of student performance and activities. Now that SEN courses are more
formalised, “we ... have to follow the rules for accredited courses, like the rest of the
college, so a lot of it is evidence based”. Computers really helped here, according to 
several interviewees, with printouts of activities undertaken on computers providing 
this — as discussed in the section on the benefits of using information technology, 
below. 


Poor functionality of ICT equipment available 

There was constant repetition by all interviewees that one of the major issues facing 
the college was that the equipment available was very poor and that this impinged 
greatly on their work: “we have things like equipment failure, and continual problems 
with printers and so forth, and things like that, internet breakdowns, things like that”. 








There were also complaints about a lack of equipment — a performing arts department, 
which produces programmes and posters, does not have a colour printer and had to 
fundraise to obtain a digital camera. The equipment that was available across the
campus was variously described as “rubbish”, “useless”, and as something that “doesn’t work”.
Special needs was considered by some to be the “Cinderella” of FE, as it was always the 
recipient of “hand-me-downs”, in the shape of ageing and poor equipment. One 
interviewee claimed that “staff have got very despondent just because of the level of 
equipment they’ve been using”. Least negative of the descriptions was that it was “not 
up to scratch”. Implications of this were that: 


* Students were often frustrated as they were often unable to print out their work, 
or, worse, disks on which it had been saved were not recognised by the computer 
when removed from the drive and reinserted later. 


* Staff always had to prepare a back-up lesson, in case the computer
malfunctioned or the server was not functioning, etc. (The example was given 
of a lesson planned to teach how to access e-mail and send a message. 
“Unsurprisingly”, the server was “down”, making the entire exercise impossible.) 


* There was a danger that such problems would discourage students and staff from
utilising ICT in the future. This was considered a particular possibility with
regard to staff. Indeed, one of the interviewees gave as one of the reasons
why she rarely uses computers her assertion that “you cannot trust them to work”.


When the problems outlined above happened, it often took too long for them to be 
rectified, as there were too few technical support staff employed. One interviewee 
suggested that the learning support unit was often placed “down the queue”, reflecting
a possible lower status for the department than others enjoyed. 


Information needs of the staff 

Questions on this topic were useful to ask as they gave an idea of the context within 
which the staff operated and also provided information as to what might be usefully 
integrated into the LE. 

Needs ranged across many areas. The head of the department had a far greater 
requirement for information than those discussed by other staff members, as befitted 
her extra responsibilities. These included needing much information extraneous to the 
school, such as details of local businesses, social services, etc. She mentioned 
specifically: 

* legislation affecting school policies and procedures; 

* local history, geography, politics; 


* roles of feeder schools, and other details about them (provision for and policies 
about supported learning); 


* social services (particularly important for “transition” pupils — those moving 
from school to work); 


* reports and news from the LEA, including financial changes and regulations, etc.; and


* college-specific administrative information, staff policies, etc. 
Most other members of staff tended to mention only information required for the
day-to-day running of their classes: 

* administrative procedures and policies; 

* lesson plans and ideas; 

* how to evidence work undertaken; and 


* current level of individual students and areas of the curriculum they still need to 
cover. 


It was rare for interviewees to mention information needs that related to the
medical condition of the pupils. One of those who did so said she researched
conditions on the internet, as appropriate to the students she happened to have in any
particular cohort. This research helped her a great deal in understanding their different 
behaviours and, thence, to helping them better and also helping herself, such as in 
dealing with aggression, etc. She said that a lot of people simply labelled a person as 
“lazy” or “greedy”, whereas in both cases, there might be medical conditions which 
made them exhibit this behaviour. Another interviewee mentioned requiring 
information about autism, as at one time many of his pupil cohort were autistic — 
as, indeed, was his own son. However, having acquired a certain level of knowledge, 
principally from sources on the internet, he felt that he did not need to continue 
researching as he was not a “specialist”. The head of one of the fieldwork locations felt 
that it would benefit staff to learn more about the various conditions pupils manifested, 
and suggested that the proposed LE feature both general and specific information on 
this subject. 

Meeting these diverse information needs was not regarded as too onerous, 
although the system of student record keeping was described as being too reliant 
on hardcopy, and not easily disseminated. There were also many arch-files of 
lesson plans and other information that were not easily accessible. Much of the 
externally produced information needed by the establishments was actually 
provided by the various agencies, government departments, etc. connected with the 
educational system. The internet was used extensively to search for information, 
although there were no specific sites or sources — people tended to simply “do a 
Google” to search. 

In addition to these sources, a major fount of information was that of colleagues and 
other peers: 


As far as how to deal with, how to teach, and how to train and work with the pupils here, are 
the people who did it last year, and they will tell you about the behaviour traits and about 
their likes and dislikes, personal foibles and you can then adjust the way you teach and 
interact with them accordingly. 


Experience with ICT 
Three factors appear to be in play, although the third tended very much to be a 
function of the second: 


(1) job-role; 
(2) personal interest; and 
(3) training undertaken. 







With regard to the first point, most of the teaching staff interviewed felt that ICT could
be very well employed within their job roles; in fact, one senior lecturer went as far as
to say that “everything is done on the computer”. This contrasts with studies showing
poor take-up and usage of ICT among teachers (Zhao and Cziko, 2001; VanFossen,
2001). Only the catering staff did not feel it necessary/appropriate to use computers or
other ICT with their students, as most of their work involved manual experience, and
these were the staff members who appeared to have less experience with and be less 
confident about using computers. Nevertheless, both teachers involved in cookery 
classes felt it was a good thing that computers had been introduced generally into the 
school setting, despite problems in usage described above. 

Personal interest, of course, played some part in teachers electing to use 
computers. Indeed, the apparent lack of formal ICT training available seemed to 
have given rise to the situation whereby it was those who had a personal interest in 
computers who pioneered usage. All staff said they were self-taught, generally at 
home, and usually with the aid of a member of the family. Typical was the 
experience of another staff member who said: “I had some initial training, but ... 
most of it’s just, either picked up from other people, or self-taught. I haven’t done 
any proper courses as such”. Similarly, “I’ve kind of learnt as I've gone along, 
people show me things and then I’ve kind of just, it’s just like hands on experience 
is probably the expression I’m groping for”.

Some training was undertaken, albeit on the personal initiative of the staff member 
concerned rather than as a result of school policies. One of the teaching assistants had 
undertaken a number of computer courses in her own time and was now adept at 
spreadsheets, databases, and image editing software as well as word processing. 
Another had undertaken basic ICT training in Microsoft Office applications. However, 
this was not common. 


Uses to which ICT is put 
A number of ICT uses were mentioned. These can be divided into usage by the member 
of staff, and usage by or with students. 

With regard to the first of these, the common activities mentioned were: 


* lesson plan writing; 
* finding information/images from the internet;
* report writing; 
* keeping records up to date; and 
* meeting the information needs outlined above. 


Usage with or by students was: 
* using PowerPoint, not necessarily to give presentations, but mainly as a 
showcase for their work; 
* writing exercises (e.g. plans of visits, etc.); 
* searching the internet for information or images of interest (one example was
that whereby the lesson topic was “appropriate clothing”, and images were found 
of, for example, protective clothing); 
* using “publisher” to create pages (mentioned by the staff member who had been
on ICT courses); 


* using whiteboards to showcase student work and as a demonstration tool; and 
* using digital cameras, about which more below. 


With regard to the use of cameras, it is quite common nowadays to have a video 
capacity of about four or five minutes in a digital camera. One interviewee reported
using these in homework, whereby students take a camera home, use it to video
something they like, bring it back to school, and show it to the class. Still pictures are
also taken regularly by students, not least because of the trans-active work being 
undertaken in both locations. Trans-active is a Mencap/University of East London 
product, which uses ICT to help people with learning difficulties communicate. The 
major activity is to compile an electronic passport, which consists of photo images 
representing likes, dislikes, needs, achievements, etc. of individual students. 
Importantly, staff reported that students were able (sometimes with the help of 
staff) to upload photos to a computer, access them later and print them out: “so they’re 
getting their ICT, not through word processing ... and they really like that, because it 
is about them”. 

None of the interviewees mentioned, with regard to student usage, either 
game-playing or using e-mail to communicate. That may not imply that these 
activities are not undertaken — these areas need to be explored in further visits. 

In addition to work currently being carried out using ICT, there were suggestions as 
to how this could be expanded in future. One interviewee mentioned using digital 
cameras which could link to the proposed “scrapbook” facility on the LE and would be 
applicable in a wide range of subject areas. The example given was that of cookery or 
mechanics classes, where photos could be taken of each step of a procedure (boiling
eggs, changing a fuse) and uploaded to a “holding” place. The photos could then be
viewed in the following week’s class, as a reminder/reinforcer for students with memory
difficulties. Clearly, as was acknowledged, this “tactic” would be applicable in
a range of other settings requiring tools, utensils or other objects.

Accompanying accounts of use and aspirations for potential use were concerns 
about ensuring that activities undertaken were at a level that was appropriate for the 
students. As was pointed out: 


A lot of (the) time, even things like Word and that [e.g. other applications] aren’t always 
appropriate to the students’ ability. So it’s (necessary to) have stuff that they can access that 
suits their abilities so they can be engaged in using ICT in that way. 


Benefits of using ICT

Despite worries that students required computer-based work commensurate with their 
intellectual — and, indeed, physical — levels, staff were generally effusive about the 
benefits of using ICT. These included: 


* enhancing the paper-based work of illiterate pupils;
* obviating problems of manual dexterity;
* having access to a vast repository of images and other material;
* improving oral communication;
* evidencing work; and
* helping to bring pupils into the process of evidencing and evaluating work.


Regarding the first point, one interviewee spoke enthusiastically of computers 
liberating special needs pupils, as they could be used with those who are unable to 
write or who can write but are very poor spellers. Another mentioned the “wonderful” 
PowerPoint presentations even pupils with severe learning difficulties can produce. 
These were the source of “real pride” for students. 

An activity that was difficult for those with limited manual dexterity was cutting 
out hard-copy images — which had been undertaken in the past from catalogues and 
magazines, etc. The “drag and drop” manipulation of electronic images clearly 
obviated these difficulties. Also, of course, the problem of collating and storing 
physical “scraps of paper” disappears in a digital environment. It was, clearly, not true 
to say that there were no difficulties with physically using computers, but despite those 
that did occur, it was generally felt that these were not as difficult to overcome as 
manipulating scissors, glue, etc. Indeed, as one member of staff noted, it was possible 
to install accessibility devices onto computers, but it was rather more difficult to adapt 
scissors. 

The internet was seen as being immensely beneficial. It was viewed primarily as a 
vast storehouse of images and information, freely available to plunder at will. It was 
very common for students to “surf” the web for images of favourite pop artists or 
characters from TV serials, and many of the more able learners were able to copy 
images and paste into Word or PowerPoint. Even those who could not master this were 
pleased to be able to access images, usually with the help of a teacher or assistant 
entering the search query. Staff also used the internet in this way: 


In the past, I would have to keep piles of magazines and catalogues, to find photos of things
I needed pictures of to help in my teaching. Now, if I am doing, say ... seasons, I can get a 
picture of an autumn tree or some snow in no time. 


One distinctive reason for a particular interviewee’s enthusiasm for ICT was his 
experience, some years previously, of being involved with a video-conferencing 
initiative which paired the students from two different schools (albeit only a room 
separated each group in the exercise). He found that the communication level of the 
students was considerably improved through the medium. Students who would hardly 
speak normally seemed to “open up” when talking on camera. Indeed, one whose usual 
response was simply to repeat the question actually gave full and coherent answers 
when using the conferencing system. It may be that speaking face-to-face with 
someone was more intimidating than doing so to an inanimate camera. The forerunner 
to the proposed LE, the UEL/Mencap “transactive” system, along with other initiatives, 
also recognised the potential for enhancing learners’ communication skills by using 
ICT, and was cited by those interviewees who had used it for helping users express 
preferences and opinions. 

Many felt that the process of evidencing pupils’ work was facilitated by computers. 
One interviewee envisioned a series of electronic templates or forms, “where you can 
put in ‘this is what they have to achieve’ and then ‘this is how they have achieved it’
and maybe some text and a photo, or something like that, it’s almost like a virtual 
portfolio they have of what they’ve done at college”. This practice would meet the
standards of the exam boards, it was said, in addition to meeting the student needs.
It was also useful for evidencing aspects of personal development, as well as 
straightforward academic targets. This would be in the form of electronic diaries and 
representations of what the pupils like/dislike, can/cannot do alone, etc. as well as 
recording developing relationships and interests. 

Evidencing work in this way, it was felt, would also be very useful in overcoming a 
certain tension that exists between some parents and the school (or individual staff), 
resulting from different perceptions and views of what the learners are actually 
capable of. In a recent study by McConkey and Smyth (2003), parents viewed many 
everyday tasks as hazardous for their cognitively disabled children (such as crossing 
the road, being at home alone, using household appliances, etc.). Forty-four per cent of 
the parents the researchers interviewed felt that their children would be unable to do 
any of a set of such tasks. The present writers found that staff claimed that in some 
cases, parents were “over-protective”, and tended to do many everyday tasks for their 
offspring (such as feeding, dressing, etc.). It was useful to have video-records of these 
things being undertaken independently, for use as evidence to doubting parents or 
carers. Unhappily, staff also maintained that it was actually in the interests of some
parents to have their children as dependent as possible, as they received bigger state
allowances as carers.

It was mentioned that not only was the evidencing of progress facilitated by 
computers and other information technology, but that bringing the pupils in on the 
process was also made easier. Learners could actually see their own improvements:
both an online diary and the video-clips mentioned above could serve as powerful
records of pupils’ progress and the work covered, and as aids for one-to-one
progress sessions with a tutor or LSA.

It is also worth mentioning that, despite these stated advantages, the exigency of 
having to produce evidence seems to create its own pressures. Evidence, of course, has 
to be a concrete representation or documentation of educational achievement. Clearly, 
this is easier to provide by showing a product of a lesson (e.g. a completed electronic
“passport”) than by an abstract process. Thus, the need to upload passport images 
takes precedence over facilitating and inculcating the choices made by the students 
with learning difficulties. Observations by the research team show that teachers often 
bypassed student input in order to complete a task and have the cherished evidence 
complete. 


Constraints and barriers to using computers 
A number of difficulties were reported, in addition to the poor functionality of ICT 
equipment, as outlined above, in the section on the working environment. These were: 


* lack of experience of various applications/operations; 

* mistrust of the accuracy of information on the internet; 

* material accessed and used by the students which is too advanced or otherwise 
inappropriate; 

* a loss of privacy; and 

* lack of technical support. 


Lack of experience of various applications/operations. As with any activity, lack of
usage and reinforcement leads to people forgetting how to use certain applications or
undertake various operations within them. One respondent used a computer as a
“glorified typewriter” only, and claimed to forget things she had learned (generally
through her son) by not using them. Indeed, the interviewee who made this observation
went on to say that she found it easier to cut out photos and mount them on
paper.

Mistrust of the accuracy of information on the internet. It was hard to discriminate
between accurate and false information on the internet. There was a danger of teaching 
the students some piece of information acquired from the internet that subsequently 
turned out not to be true. 

Material accessed and used by the students which is too advanced. It was said that 
students were often coming to class — or producing in class — printouts of internet 
pages that were far too advanced for their level of understanding. Also, sometimes the
material was not particularly useful in terms of the lesson at hand (although, of
course, such pages might be interesting in their own right). 

A loss of privacy. One member of staff felt that the current requirement to put her 
schedule on the web was intrusive, and she felt that others having “unlimited” e-mail 
access to her also infringed her privacy. 


Conclusion 

This small-scale study has examined the working environment within which teachers
of SEN students work, and elicited some of the major issues. The working
environment was found to have changed greatly in recent years, in many ways 
positively. There is now a more formal and structured curriculum for learners with 
special needs, and attempts at activities designed to foster inclusion have been 
successful. Difficulties faced by teachers included a lack of equipment and the poor
functioning of what was available, a paucity of appropriate learning materials, unusual challenges posed by
students’ unpredictable and non-linear intellectual progression and, of course, 
problems of work pressures such as staffing levels and occasional (albeit on-going in 
some cases) difficulties with parents. 

The information needs of teachers were found to be overwhelmingly practical — 
guidance on administrative procedures, having lesson plans ready, mechanisms for 
evidencing work, etc. It appeared that the day-to-day demands of the job precluded
serious investigation by staff into the various conditions their charges might have,
unless these seriously affected classroom performance. Only a small minority of
those interviewed undertook such investigation. The wider issues of legislation and of the interface
between the school and social services, etc. were very much the domain of the 
head. 


Most interviewees, even the techno-phobic ones, felt that computers and the internet
constituted a useful, additional teaching tool in their work. Advantages ranged from
enhancing the learning experience and product and “liberating pupils” from problems
such as physical cutting and pasting, to evidencing work for external assessment and
parental consultations. However, most interviewees expressed concern about the low
quality of their equipment (e.g. computers, networks), the fact that they do not receive
very satisfactory technical support, or the simple fact that they do not have the
necessary hardware or software.

As noted by interviewees, observed by the research team and, if implicitly,
recognised in government legislation[2], the category “learning disability” encompasses a range of
different conditions which, therefore, entail a range of educational and access
requirements. The LE being produced, reflecting this, will be designed so that it allows 
users to access multimedia content, learning materials, games and so on appropriate to 
their individual level of ability and special needs. In order to do this, information about 
users’ abilities, preferences and access needs will be input into the LE and used to build 
up a profile. This, in turn, will be used to filter appropriate learning materials from a 
content management system, and deliver them to the users’ screens in the form of a 
series of illustrated links to activities that match the specific profile.
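
The profile-based filtering outlined above might be sketched as follows. This is a minimal illustration under stated assumptions, not the design of the LE itself: the Profile and Activity classes, the five-point ability scale, the access-need labels and the function names (matches, select_activities) are all hypothetical, and the catalogue entries are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical user profile built from information input into the LE."""
    ability_level: int                            # assumed scale: 1 (lowest) to 5
    access_needs: set = field(default_factory=set)
    interests: set = field(default_factory=set)


@dataclass
class Activity:
    """Hypothetical learning activity held in the content management system."""
    title: str
    min_level: int
    max_level: int
    supports: set = field(default_factory=set)    # access needs catered for
    topics: set = field(default_factory=set)


def matches(profile, activity):
    """An activity suits a user if the level fits and every access need is met."""
    level_ok = activity.min_level <= profile.ability_level <= activity.max_level
    access_ok = profile.access_needs <= activity.supports
    return level_ok and access_ok


def select_activities(profile, catalogue):
    """Filter the catalogue by profile; activities matching an interest come first."""
    suitable = [a for a in catalogue if matches(profile, a)]
    # sorted() is stable, so catalogue order is kept within each group.
    return sorted(suitable, key=lambda a: not (a.topics & profile.interests))


# Invented catalogue entries, for illustration only.
CATALOGUE = [
    Activity("Seasons picture quiz", 1, 2, {"screen reader", "switch access"}, {"seasons"}),
    Activity("Plan a visit", 2, 4, {"switch access"}, {"travel"}),
    Activity("Pop star scrapbook", 1, 3, {"switch access", "screen reader"}, {"music"}),
]

learner = Profile(ability_level=2, access_needs={"switch access"}, interests={"music"})
for activity in select_activities(learner, CATALOGUE):
    print(activity.title)
```

Treating access needs as a hard filter and interests only as an ordering preference is one possible design choice; it ensures that nothing inaccessible is offered, while still placing material of personal interest at the top of the list of links.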

Following the research described in this paper, the teacher section will include, most 
importantly, mechanisms for evidencing work, lesson plans — classified according to 
cognitive and accessibility needs — and online forms to monitor progress. The Head of 
one of the fieldwork locations suggested, as noted earlier, that information about 
specific conditions be housed on the LE too, in order to encourage staff to learn more 
about their pupils and the origins and causes of their difficulties. 

It is hoped that the LE being developed will help greatly in enhancing both the 
learner and the teacher/helper experience, such that all the potential benefits of the 
system outlined by interviewees are realised, so that the learners are able to progress at 
their own pace and work with material of their own personal interest, and the staff can 
monitor and evidence progress more comprehensively and easily than hitherto. 


Notes
1. For example, The Disability Discrimination Act 1995, The Disability Rights Commission 
(DRC) Act 1999, Special Educational Needs and Disability Act, 2001, and The Disability 
Discrimination Act 2005. 


2. The UK government’s Disability Discrimination Act 1995, for example, defines a disability 
as “a physical or mental impairment which has a substantial and long-term adverse effect on 
his ability to carry out normal day-to-day activities”. This definition applied also to the 
Special Educational Needs and Disability Act 2001 (www.opsi.gov.uk/acts/acts2001/20010010.htm).
Abstract


Purpose — To provide a review of the log analysis studies of use and users of scholarly electronic 
journals. 

Design/methodology/approach — The advantages and limitations of log analysis are described 
and then past studies of e-journals’ use and users that applied this methodology are critiqued. The 
results of these studies will be very briefly compared with some survey studies. Those aspects of 
online journals’ use and users studies that log analysis can investigate well and those aspects that log 
analysis cannot disclose enough information about are highlighted.

Findings — The review indicates that although there is a debate about reliability of the results of log 
analysis, this methodology has great potential for studying online journals’ use and their users’ 
information seeking behaviour.

Originality/value — This paper highlights the strengths and weaknesses of log analysis for 
studying digital journals and raises a couple of questions to be investigated by further studies. 


Keywords Electronic journals, Information searches 
Introduction

Finding out about the usage patterns of scholarly journals has been important for both
librarians and publishers for a long time. Libraries’ interest in the use of journals is 
twofold. First, research and academic libraries spend the biggest portion of their 
acquisition budget on serials. Thus the statistics of the Association of Research 
Libraries (ARL) show that serial expenditure is a little more than double the 
expenditure on monographs. Additionally, ARL statistics show a 260 per cent rise in 
serials expenditure (compared to a 66 per cent rise in monograph expenditure) between 
1986 and 2003, while the amount of serials purchased was in decline until 2001 
(Kyrillidou and Young, 2003). In the UK, between 1996-1997 and 2000-2001 the 
proportion of university library information resource expenditure on journals 
increased from 47 to 52 per cent, but this increase has failed to maintain current levels 
of journal subscription (House of Commons, 2004). Secondly, virtually all academic and
research libraries are moving towards electronic access to journals. Thus the budgets 
of American academic libraries for electronic journals rose more than 1,800 per cent 
from 1994 to 2003 (Young and Kyrillidou, 2004). In an information environment that is 
dominated by electronic access, users who have the world of knowledge at their 
fingertips are physically disappearing from the librarians’ view. Users have become 
virtual and anonymous. Therefore, obtaining an understanding of the usage of 
electronic journals, and the information seeking behaviour of users is of great
importance for both libraries and publishers. Librarians need to know about different
aspects of use (including the quantity, patterns and the quality of use) in order to be 
able to justify their expanding budgets, improve their services and increase the value 
they add to their mother organisations. 

From the perspective of publishers, Simba predicted a $12.56 billion revenue for the 
whole STM publishing industry in 2005; and journals, which increasingly are accessed 
through online databases, are the leading delivery mechanism for accessing STM 
content in terms of revenue (Jastrow, 2004). It is of crucial importance for publishers to 
track the changes on the demand side of the market and know about the use of their 
products. As a matter of fact, some of the most important studies about the usage of 
journals have been sponsored or conducted by large publishers, which is a sign of their 
concern about this issue. The point here that needs to be made is that as a result of the 
migration of the user to the digital environment publishers have inherited all the 
knowledge of the user the librarian once had. 

So far many studies with different objectives have been conducted on the use of 
scholarly journals. Before the advent of online journals[1], most of the studies on
journal usage were based on citation analysis, reshelving data or questionnaires. Each 
of these methods has its own limitations. Citation analysis does not represent all of 
journal usage as authors do not cite all the articles they read and, moreover, not every 
journal reader is an “author”. Reshelving data are not accurate, nor do they provide 
sufficient details of use. In the case of reshelving data, it is not possible to distinguish 
between the use of individual articles or the whole journal. They also do not include use 
of personal subscriptions and the type of use. In addition to these two methods, many 
questionnaire-based studies have been conducted. But such studies rely heavily on 
what people think they do or might do — not what they actually do, and that can result 
in misrepresentations.

Thanks to the widespread use of computer and network technologies for facilitating 
access to scholarly journals, a new opportunity has emerged for studying journal usage 
and scholarly information seeking behaviour. Computers record or log all user 
transactions in a plain text file known as a “transaction log”. Log files contain data 
about many of the details of the users’ interaction with the system. Therefore, some 
researchers have adopted log analysis to find out about the use of electronic journals in 
terms of both the volume and patterns of use. Although log analysis has opened a new
horizon for the study of e-journal use and users’ information seeking behaviour, it has
its own limitations. These can be minimised by the way researchers approach and
apply the methodology, for example by processing the raw logs rather than relying
on proprietary software.
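As a rough illustration of what processing the raw logs can involve, the following sketch parses a single transaction log line into named fields. It assumes the server writes the NCSA Common Log Format, which the studies reviewed here do not specify; the sample line, host address and file path are invented for illustration.

```python
import re
from datetime import datetime

# Assumed layout: NCSA Common Log Format
# (host ident user [time] "request" status bytes).
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return one log line as a dict of fields, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    # A "-" in the size field means no body was sent.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


# Hypothetical log line; every value in it is made up.
sample = ('192.0.2.7 - - [12/Mar/2004:10:15:32 +0000] '
          '"GET /journal/vol1/article3.pdf HTTP/1.0" 200 48213')
record = parse_line(sample)
print(record["host"], record["path"], record["status"], record["size"])
```

Working from parsed fields like these, rather than from the summary reports of a packaged analyser, leaves the researcher free to decide how hits, users and sessions are defined.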

The aim of this paper is to review previous studies about the use of scholarly digital 
journals, especially those studies that have employed log analysis. The paper provides 
an analysis of what researchers have found using log analysis techniques and the different
ways they have employed this methodology, and briefly compares their results with the
findings of other methods. The advantages and limitations of log analysis as a 
methodology will first be described and then the previous studies of e-journals that 
applied this methodology discussed. The results of these studies will be briefly 
compared with selected questionnaire survey studies and strengths and weaknesses of 
log analysis studies will be highlighted. Some issues that merit further investigation 
will be suggested. 
Log analysis

Background 

Regarding the special application of transaction log analysis in the field of library and 
information sciences, Peters et al. (1993) defined it as a “study of electronically recorded
interactions between online information retrieval systems and the persons who search
for the information found in those systems”. Covey (2002) maintained that transaction
log analysis (TLA) was developed about 25 years ago to evaluate system performance.
However, Peters (1993), in an extensive review of past studies, showed that this
method has been used since 1960 for different purposes.

Peters divided the development of TLA into three main phases. The first phase 
(from the mid-1960s to the late 1970s) was characterised by placing emphasis on 
evaluating system performance, rather than user behaviour and performance. During 
the second phase (late 1970s through the mid-1980s), TLA was first applied to the 
study of online catalogue systems (OPACs). In general, the early researchers were
equally interested in how the system was being used (e.g. which search options were
chosen, and in what order) and in the searching behaviour of the users (e.g. length of a
search session, how many and what types of errors tended to be made). The third
phase of TLA (from the mid-1980s to the early 1990s) was characterised by
diversification. Some researchers chose to concentrate on specific search behaviours 
(e.g. subject searching), specific user groups (e.g. dial access users), or other types of 
information system/platform (e.g. CD-ROM workstation). It was during this phase that 
replications of previous transaction log analyses and longitudinal studies began to
appear, and analyses of systems co-existed with analyses of user behaviour,
sometimes in the same research project. Most transaction log analyses focused on
actual use of operational IR systems by public users (Peters, 1993).

Since the appearance of Peters’ article in 1993, we have witnessed the advent and 
rapid growth of the web and different kinds of web-based electronic resources. Peters 
at the end of his article noted the need to apply log analysis to the study of usage
on the internet. During the last decade, web log analysis has developed out of TLA as a
method for studying web-based resources. This period can be considered as the fourth
phase of the development of log analysis methodology. The focus of most research in
this phase, especially in recent years, has been on the information seeking behaviour of
users of a whole range of different web-based resources including OPACs, digital
libraries, electronic journals, search engines, web sites, and so on.


Advantages and disadvantages 

TLA, and particularly web log analysis, has several potential advantages and
disadvantages, which have led researchers to adopt contradictory views about the
methodology. On the one hand, some researchers such as Gutzman (1999) described
logs as “treasure troves of valuable information” and researchers from the CIBER[2]
research group called them the “CCTV of Cyberspace” (Nicholas et al., 1999) and “digital
fingerprints” (Nicholas et al., 2005a). On the other hand, Fuller and De Graff (1996)
claimed that “web server log provides a distorted metric of user activity” and some
other researchers went even further. Goldberg (2001) considered web usage statistics
“worse than meaningless” and Udell (1996) called them “damned lies”. However, the
effectiveness of log analysis as a method for studying the user depends on several 
factors including the software used for analysis and the objectives of the analysis. 
Kurth (1993) pointed out four elements of TLA, which form a framework for
considering its limits and limitations:

* the online system(s) studied;
* the user and the search process;
* the analysis of transaction log data; and
* the ethical and legal issues involved.


Covey (2002) highlighted the importance of the following points for conducting a
TLA:

* deciding on the right and most useful usage statistics;
* collecting the right usage statistics;
* getting consistent usage statistics from vendors (it can be time-consuming and
the size of the dataset can quickly grow to an enormous size);
* analysing and interpreting data (it can be problematic); and
* presenting data in a meaningful way.


Advantages

Log data provides researchers with a great opportunity to study users of a system (for 
instance, a digital library) or the system itself. Some of the key features and attributes 
of logs which make them a valuable source of data are mentioned here: 


(1) Log data are unfiltered and automatically collected (Nicholas et al., 2001). There
is no human interference in the process of data collection, only in the
interpretation.

(2) Log data are non-intrusive. They provide the researcher with direct information
about what millions of people have done, not what they say they might, or
would, do; not what they were prompted to say; and not what they thought they
did (Nicholas and Huntington, 2003). Unlike a survey in which participants may
choose to alter or hide their true feelings and usage patterns, in log analysis
users’ attitudes toward the system do not affect the results (Peters, 1989).
Nielsen (1986) mentioned that log studies are a way of unobtrusively looking
over an end user’s shoulder as he or she uses the computer system. There is
no problem of low response rates or biased samples as no one is likely to refuse to
take part in the study.

(3) When combined with survey and interview studies, log analysis is an effective
way to detect discrepancies between what users say they do (for example, in a
focus group study) and what they actually do when they use an online system
or web site (Covey, 2002). In fact, log analysis is a suitable method for raising
evidence-led questions to be put to users in questionnaire surveys or interview
studies.

(4) Log analysis is an efficient way to gather longitudinal usage data. There is no
time limitation as long as the log files exist. Also there is no need for sampling
in log analysis (Nicholas et al., 2005a).
(5) Log analysis is a good way to test hypotheses; for example, to determine
whether the placement or configuration of public computers in the library
affects user behaviour (Covey, 2002).

(6) Log analysis is an efficient evidence-based method for evaluation of the
performance of a system such as a digital journal library against its objectives.

(7) Log analysis is a suitable method for studying and comparing information
seeking behaviour of user groups. Log data provide the researcher with detailed
information about different aspects of information seeking behaviour of users
such as time of use, type of material used, navigational patterns, and so forth.


Limitations 
Generally, the main problems and limitations of log analysis can be classified as 
follows: 


(1) The difficulty in differentiating user performance from system performance.
Although differentiating between system performance and user performance
has been considered one of the fundamental goals not only of log analysis
but of all information retrieval studies (Hancock-Beaulieu et al., 1991), it is
difficult and sometimes impossible to distinguish the two. For example, the
distinction between system response time and user “think time” is not obvious
or clear (Kurth, 1993). In the case of web log analysis, it should be borne in mind
that it is computers or computer networks that are the virtual users of the web.
Log files of web servers record the actions of these computers and computer
networks and not directly the actions of end users (Nicholas et al., 2000).

(2) User identification difficulties. The difficulty in identifying users is another
factor that inhibits applying log analysis to studying user behaviour. Kurth
(1993) pointed out the possibility that a user may move from terminal to
terminal while using a system or two users may alternately use a single
terminal. This is very likely to happen with public terminals, located in libraries
and the like. The usage data from these terminals represent a group of users.
This problem is exacerbated by the difficulties of distinguishing between
sessions, since internet users do not formally log off. Identification of web users is
more complicated and problematic than in traditional information retrieval
systems. Specifically:
* The web surfing population includes millions of people from all around the
world who have access to the internet. The larger the user population, the
more difficult it is to identify individual users or user groups.

* There are big problems with identifying users via their internet protocol (IP)
addresses, the main approach to identifying users. Some internet service
providers (ISPs) use dynamic addressing. For instance, one user may connect
to the network with two different IP addresses in two separate sessions or
two users may be allocated the same IP at different times (Bauer, 2000). Even
the geographical distribution of the users cannot be accurately determined
because geographic information depends on where an IP address was
registered, and a user’s PC may be located in a different geographic location
from where its IP address was registered (Haigh and Megarity, 1998). This
problem occurred in the study of web logs of The Times/The Sunday Times
newspaper web sites. Although, according to their IP addresses, it seemed
that 60 per cent of the users of the newspaper web site came from the US,
checking the subscriber database of the publisher showed that only 26 per
cent of them were registered as US residents (Nicholas et al., 2000).


* The problem of session detection makes this situation more complicated. 
Because nobody logs off on the web, researchers allow for a suitable interval,
say 30 minutes, and then assume that the user has logged off and the
session has ended. This operational definition of a session may lead to 
biased data or misinterpretation of the results. For instance, people may be 
logged on to the web but are not using it (Nicholas et al, 1999). They might 
be drinking a cup of tea or using another application on their machine. Also 
30 minutes does seem generous in the frenetic world that is web searching. 


* Proxy servers are another factor that frustrates user identification. Proxy
servers are intended to provide an optimisation that decreases latency
and reduces the load on servers by caching information from a web server and
acting as an intermediary between a web client and the web server. For
example, a corporation configures all its browsers to send requests to the
proxy. The first time a user in the corporation accesses a given web page, the
proxy must obtain a copy from the server that manages the page. The proxy
places the copy in its cache, and returns the page as the response to the
request. The next time a user accesses the same page, the proxy extracts the
data from its cache without sending a request across the internet.
Consequently, traffic from the site to the internet is significantly reduced
(Comer, 2000). Therefore, proxy servers represent an aggregate of individual
users in usage data.


* Firewalls are another barrier to identifying users or indeed IPs. A firewall is
a combination of hardware and software that isolates an organization’s
internal network from the internet at large, allowing some packets to pass
and blocking others (Kurose and Ross, 2003). This leads to inaccurate
counting of the number of users and the amount of use by users.


* Spiders, robots or crawlers are agents used by search engines and
organisations to log and index pages into a database. Robot activity is
normally recorded as a use in log files. Fortunately, they can be identified
and excluded from the analysis (Nicholas et al., 2002).


(3) An absence of knowledge of why the user did what they did. Log analysis, as is
clear from its name, just records the interaction between an information system
and a user whose identity is usually not clear. Therefore, transaction log data
do not tell us anything about the users’ perception of their searches.
Nor can logs reflect users’ satisfaction with the system (Kurth, 1993). Haigh and
Megarity (1998) also pointed out that log files shed no light on the reasons
requests were made, user motivations for using the service, reactions to content,
and all other qualitative aspects of use. However, it should be mentioned that some
advanced log analysis methods, such as deep log analysis (DLA) (see below),
are able to calculate repeat visits, which does provide some evidence of user
satisfaction, as users would not come back to the service if they did not like it.
(4) The inaccurate measurement of the volume of use. Web browser software such
as Internet Explorer or Netscape caches all pages requested by the user on the
client’s local machine in order to increase the efficiency and speed of browsing.
Once a web page has been cached by the client’s machine, it will not be
downloaded from the server for the next request. Consequently, the use of a
cached page is not logged on the server. This phenomenon leads to biased data
and it affects both usage data (number of hits) and view time. Fieber (1998)
maintained that between 35 and 55 per cent of page impressions are not
recorded in logs because of caching. Caching is essential for the web and
disastrous for statistics. There are several levels of caching, including browser
cache, local cache and large regional cache (Goldberg, 2001). Proxy servers,
which were discussed above, are a kind of local cache. Nicholas et al. (2002), in
their research on health web sites, found that the extent of page caching depends
on the architecture of the site. Haigh and Megarity (1998) also mentioned the effect of
site structure on the results of log analysis.
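The undercounting caused by caching can be made concrete with a small simulation. The sketch below is a deliberate simplification: it assumes one cache per client that never expires, and the request stream is invented. The second function shows only the arithmetic implied by a figure such as Fieber's (if a share c of page impressions is never logged, a logged count N corresponds to roughly N / (1 - c) impressions); it is not a correction method proposed by any of the studies cited.

```python
def served_from_origin(requests):
    """Return the requests that reach the server, given one never-expiring cache per client."""
    cache = {}   # client id -> set of pages that client has already cached
    logged = []
    for client, page in requests:
        seen = cache.setdefault(client, set())
        if page not in seen:
            # Cache miss: the request goes to the server and is written to its log.
            seen.add(page)
            logged.append((client, page))
        # Cache hit: the page is viewed again but leaves no trace in the server log.
    return logged


def estimate_true_impressions(logged_count, unlogged_share):
    """Scale a logged count up, assuming a known share of impressions was never logged."""
    return logged_count / (1.0 - unlogged_share)


# Invented stream of (client, page) views in the order they happened.
requests = [("a", "/toc"), ("a", "/art1"), ("a", "/toc"),
            ("b", "/toc"), ("a", "/art1"), ("b", "/toc")]
logged = served_from_origin(requests)
print(len(requests), "page views,", len(logged), "server log entries")

# Range implied by an unlogged share of 35 to 55 per cent for 1,000 logged impressions.
low = estimate_true_impressions(1000, 0.35)
high = estimate_true_impressions(1000, 0.55)
print(round(low), "to", round(high), "estimated impressions")
```

In practice the unlogged share is unknown and varies with site architecture, so such scaling gives at best a rough range.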


It is worth mentioning here that some of the above limitations and pitfalls of log 
analysis are potential in nature and their occurrence depends on several factors 
including, among others, the kind of software used, the method of data refinement, 
objectives of the analysis, nature of the system which is being logged and so on. In fact 
researchers can minimize the effect of the aforementioned disadvantages on the results
by choosing an appropriate strategy for the analysis.
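Two of the refinement steps discussed above, excluding robot traffic and applying an inactivity timeout to delimit sessions, can be sketched as follows. This is a minimal illustration under stated assumptions: each hit is a hypothetical (IP address, timestamp, user agent) tuple, robots are recognised only by a few assumed user-agent substrings, and the IP address stands in for the user, with all the limitations already described.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)        # the interval mentioned in the text
ROBOT_MARKERS = ("bot", "crawler", "spider")   # assumed substrings, not an exhaustive list


def is_robot(user_agent):
    """Crude robot test based on the user-agent string."""
    agent = user_agent.lower()
    return any(marker in agent for marker in ROBOT_MARKERS)


def sessionize(hits):
    """Group (ip, timestamp, user_agent) hits into sessions per IP address.

    A new session starts when the gap since that IP's previous hit exceeds
    SESSION_TIMEOUT. Robot hits are dropped before grouping.
    """
    sessions = {}   # ip -> list of sessions, each a list of timestamps
    for ip, when, agent in sorted(hits, key=lambda hit: (hit[0], hit[1])):
        if is_robot(agent):
            continue
        ip_sessions = sessions.setdefault(ip, [])
        if ip_sessions and when - ip_sessions[-1][-1] <= SESSION_TIMEOUT:
            ip_sessions[-1].append(when)
        else:
            ip_sessions.append([when])
    return sessions


# Invented hits: one human-like address with a 45-minute pause, and one crawler.
start = datetime(2004, 3, 12, 10, 0)
hits = [
    ("192.0.2.7", start, "Mozilla/4.0"),
    ("192.0.2.7", start + timedelta(minutes=10), "Mozilla/4.0"),
    ("192.0.2.7", start + timedelta(minutes=55), "Mozilla/4.0"),
    ("198.51.100.9", start + timedelta(minutes=1), "Googlebot/2.1"),
]
sessions = sessionize(hits)
print({ip: len(ip_sessions) for ip, ip_sessions in sessions.items()})
```

Changing SESSION_TIMEOUT changes the number of sessions counted, which is why the operational definition of a session should be reported alongside the results.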


Scholarly e-journals use and user studies 

There is a rich literature related to the study of scholarly journal use and their users. 
Peek and Pomerantz (1998) reviewed studies on “Electronic scholarly journal 
publishing” and described the future of electronic scholarly journals as “unclear”. 
Pointing out the limited base of research on electronic journals, they maintained that it 
was difficult to determine from the available research how interested the majority of 
scholars were in changing the scholarly publishing process. King and Tenopir (1999) 
reviewed the literature dealing with scholarly journal demand, use, and readership 
over the past 40 to 50 years. They concluded that the high levels of usefulness and 
value of scholarly journals have persisted over the years and scientists continue to read
a great deal and spend considerable time reading, especially scholarly journals. Kling 
and Callahan (2003) discussed different aspects of electronic journals as a means of 
scholarly communication. They reviewed the current state of the literature about the 
advantages and disadvantages of e-journals and their perception by academics. They 
came to the conclusion that most of the behavioural studies of e-journal use had been 
conducted at research-intensive institutions, and neither focused on primary research
use nor distinguished between the ways that scholars use journals (for teaching, for
review articles, and so forth). They also believed that many studies of e-journal use 
were based on closed-form surveys and, although some of these studies were very 
useful, these kinds of surveys were likely to be too blunt an instrument to answer some 
of the questions about the roles of e-journals in the more complicated scholarly 
constellation. The most recent and comprehensive review of the current state of 
electronic resources use and users is a report by Tenopir (2003) for the Council on 
Library and Information Resources (CLIR). She reviewed and summarised more than 
200 recent research publications (published between 1995 and 2003) that focused on the
use of electronic library resources. Her review led to some key conclusions including:
* the rapid adoption of electronic resources in the academic environment;
* different usage patterns and preferences among different disciplines;
* the importance of browsing a small number of core journals for subject expertise,
especially for current awareness; and
* that most journal article readings were of articles within their first year of
publication.


We will review most of the previous studies that applied log or usage data for studying 
use, users, or e-journals. 


Log based studies: researchers’ approach 
The application of log analysis for studying electronic journal use and users goes back 
to even before the advent of online full-text journals. It began in the early 1990s with 
some experimental projects such as TULIP (The University Licensing Program) 
project (TULIP, 1996), which ran from 1991 to 1995. The aim of the project was to test a 
system for electronic delivery of journal information to a user’s desktop in an academic
environment and also to find out about users’ behaviour. 

There are not many studies on use and users of online journals that have applied
logs or usage data. Table I presents brief details of the main studies that have used logs
or usage data. As can be seen in the table, these studies are varied in terms of time and
coverage. They also differ in terms of the objectives of the study and consequently the
kinds of analyses. Older projects (i.e. TULIP and SuperJournal) were conducted at a
time when there were not so many journals available electronically and they
aimed to test a system for electronic delivery of journal contents and to find out about
the users’ interaction with the systems in order to develop this kind of service. Later
projects such as Taiwan ScienceDirect Online (TSDO), CIBER’s research, eJUSt, and
Gargiulo’s study evaluated existing services that were already in use. However,
they had different goals and approaches. Some, such as that by Gargiulo (2003),
focused on use and did simple analysis; some, such as TSDO (Ke et al., 2002), focused on
users as well as use and conducted more detailed analyses; and others, such as
CIBER’s research (Nicholas et al., 2003, 2005a), dealt with the development of new
methodology as well as detailed study of use and users.

Table I. Online journal studies based on log and usage data

Research              Start  Time lapse               Number of journals  Analysis software  Kind of data analysed
TULIP                 1991   4 years                  43                  N/A                Log
SuperJournal          1996   2 years                  49                  SPSS               Log
Zhang                 1996   9 months                 1                   Getstats           Log
Watters et al.        1997   3 months                 1                   Perl scripts       Log
Morse and Clintworth  1999   6 months                 194                 N/A                Log
Obst                  1999   Varied (5 to 24 months)  270                 N/A                Usage data handed by publishers
Taiwan ScienceDirect  2000   9 months                 1,300               A program in C     Log
FinELib               2001   2 years                  8,400               N/A                Usage data handed by library portal
Gargiulo              2002   18 months                +1,000              SAS                Log
Davis                 2002   3 months                 29                  SPSS, Excel        Usage data (IP based)
eJUSt                 2002   1 day                    144                 N/A                Log
CIBER Emerald         2002   1 year                   +100                SPSS               Log
CIBER Blackwell       2003   2 months                 +700                SPSS               Log

Unfortunately not much information has been provided about the method of 
analysis in the publications. In some studies researchers have used either processed 
data obtained from proprietary software (Zhang, 1999) or processed statistics provided 
by the publishers of journals (Davis, 2002; Obst, 2002; Kortelainen, 2004). In a few cases
(Gargiulo, 2003), researchers used non-proprietary analysis software such as Statistical
Analysis System (SAS) or Statistical Package for the Social Sciences (SPSS); other analyses
were not very deep and researchers just presented a few general breakdowns such as
total PDF downloads.

Only in the case of the SuperJournal Project (Eason et al., 2000), which was
supported by a programming team and different logging software, were details
recorded in the logs about the users’ information behaviour. Since no single program
could provide the SuperJournal researchers with overarching logging, a number of
logging programs were adopted, each responsible for one or more parts of the
application. Because of the range of logging software used and the customised
application that was designed specially for the project, researchers did not face some
of the typical problems of log analysis, such as caching and problems with proxy
servers or floating IP addresses. A number of variables were used to create different
SPSS analyses including: journal usage, journal content usage, and use of browse,
search and special functions (number of sessions per month for each of these), time of
use (number of sessions for each day and hour), location of use (number of sessions
for each location), frequency of use (number of sessions per month), depth of use
(proportion of sessions at article level), and breadth of use (number of journals used
at issue-list level) (Yu and Apps, 2000). Although the diverse methodology (log
analysis, survey, and interview) and longitudinal aspect were some of the strengths
of SuperJournal, the research covered a very small group of subjects, in fact just
four specific subjects. Also, the user interface with a hierarchical structure that had
been designed particularly for the project was different from the interfaces of today’s
digital journal libraries.
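Three of the variables listed above (frequency, depth and breadth of use) can be illustrated with a short sketch. The session records and field names below are hypothetical and are not SuperJournal's actual schema; breadth is simplified here to the number of distinct journals touched in any session, rather than journals used at issue-list level.

```python
from collections import Counter

# Hypothetical session records; field names and values are assumptions for illustration.
sessions = [
    {"user": "u1", "month": "1998-01", "journals": {"J1", "J2"}, "reached_article": True},
    {"user": "u1", "month": "1998-01", "journals": {"J1"}, "reached_article": False},
    {"user": "u1", "month": "1998-02", "journals": {"J3"}, "reached_article": True},
    {"user": "u2", "month": "1998-01", "journals": {"J1"}, "reached_article": False},
]


def frequency_of_use(records, user):
    """Number of sessions per month for one user."""
    return Counter(r["month"] for r in records if r["user"] == user)


def depth_of_use(records, user):
    """Proportion of the user's sessions that reached article level."""
    own = [r for r in records if r["user"] == user]
    if not own:
        return 0.0
    return sum(r["reached_article"] for r in own) / len(own)


def breadth_of_use(records, user):
    """Number of distinct journals the user touched across all sessions."""
    journals = set()
    for r in records:
        if r["user"] == user:
            journals |= r["journals"]
    return len(journals)


print(frequency_of_use(sessions, "u1"))
print(depth_of_use(sessions, "u1"), breadth_of_use(sessions, "u1"))
```

Measures of this kind require that sessions can be attributed to identified users, which the project's customised application made possible.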

TULIP (The University Licensing Program) project (1996), which began in early 
1991, aimed to test systems for networked delivery to, and use of journals at, the
user's desktop and also to find out about users’ behaviour. The project was carried out 
in an academic environment and users included faculty members and students. Four 
types of analysis were carried out on log data: 


(1) an overview of the usage by type (browsing abstracts or page 
images/printing/other) per site per month for all users; 

(2) an overview of usage by all user types per month (faculty, graduate students, 
undergraduates, library staff, staff, others, and unknown); 


(3) the same analysis, but only of the usage of the core user groups: faculty and 
graduate students; and

(4) a month-by-month report on the development of the number of users and repeat
users for the faculty and graduate students.


In the Taiwan ScienceDirect project (Ke et al., 2002), the focus of the researchers was on
the details of search queries including query fields, length, operators and refinement,
and term occurrences. They considered each IP as one user. But they admitted that
because subscribing institutions used proxy servers and firewalls, the data seemed to
be partly biased. Although the service was accessed by more than 30,000 IPs, close to
half of the full-text views were requested from just 100 IPs. This indicated that most of
the users were hidden behind a proxy or caching server. In fact some individual IPs
represented an institution or a group of users, not an individual user.
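A concentration check of this kind, the share of all views that comes from the most active IP addresses, is simple to compute from raw logs. The sketch below is illustrative only: the function name and the data are invented, and a high share is a warning sign that some addresses are proxies or institutional gateways rather than proof of it.

```python
from collections import Counter


def top_ip_share(view_ips, top_n):
    """Fraction of all views that come from the top_n most active IP addresses."""
    counts = Counter(view_ips)
    if not counts:
        return 0.0
    top_total = sum(count for _, count in counts.most_common(top_n))
    return top_total / sum(counts.values())


# Invented data: two addresses behave like institutional proxies,
# the remaining forty like single users with one view each.
views = ["10.0.0.1"] * 50 + ["10.0.0.2"] * 10 + [f"10.0.1.{i}" for i in range(40)]
share = top_ip_share(views, top_n=2)
print(f"{share:.0%} of views come from the 2 busiest of {len(set(views))} addresses")
```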

Gargiulo (2003) used IP authentication, username and password for user 
identification. She used log analysis just to extract the usage data to evaluate the 
efficiency of Big Deal subscription. In Big Deal subscriptions, publishers usually offer 
access to all their publications as a set price package. She “guessed” that users 
accessed the service from their personal workstations and therefore their log data was 
free from the worst effects of corporate workstation use. The analyses did not include 
any details about users and their behaviours. The only analyses provided were 
breakdowns of article downloads over time (month and hour) and the number of
downloads per title.

Researchers at Stanford University conducted an electronic journal user study 
(eJUSt) that focussed on HighWire Press journals. The analyses mainly included type
of use (file format, searching, and browsing) and referrals. Their one-day sample of
web log data of 14 medical and life science journals did not seem sufficient to make
robust inferences about user behaviour. However, they found some navigational 
patterns in the use of e-journals which we shall mention in the next section (Institute 
for the Future, 2002). 

Davis conducted a number of studies on online journal usage. In the first study,
Davis (2002) investigated two years’ worth of usage data from the North East Research
Library (NERL) consortium, which included more than 200 journals. Instead of
analysing log data, he was provided with data by Academic Press as summary
statistics: by journal, by institution and by month. The focus of the study was on
institutional use rather than usage by individual users or user groups. In another
study, Davis and Solla (2003) analysed three months’ worth of usage data for 29
journals from the American Chemical Society (ACS) downloaded at Cornell
University. The analysis was based on IP addresses and full-text article downloads. The
user was defined in this study as an individual IP address and the main analysis was
the number of article downloads per journal per IP. The researchers identified the
library proxy server, which was treated as an aggregate user, and they also matched
the IPs with different departments. Accordingly, they made broad inferences about
the information behaviour of academics in different departments. In a third study,
Davis (2004a) again investigated three months of transaction log data of ACS journals
downloaded at Cornell University. He limited the analysis to referral URLs in the log
data to find out how users reached the journals. In a recent article,
Davis (2004b) studied usage data of HighWire journals in 2003 for 16 universities in the
US, UK and Sweden and showed that there was a predictive relationship between the
number of article downloads and the number of users (i.e. the size of the user
population can be estimated by just knowing the total use of a journal). He focused on
two variables: the number of IP addresses as surrogates for users, and the number
of downloads (both HTML and PDF).

Zhang (1999) and Watters et al. (1998) analysed the log data of just one journal.
Zhang used proprietary log analysis software, Getstats, to study a nine-month period
of log data of the web-based journal Review of Information Science. He analysed the
data in terms of type of use (TOC, article download, and other features) and
geographical distribution of accesses as revealed by referrals. Zhang’s study did not
have a longitudinal aspect and he did not study returnees. Watters et al., in their
study, used PERL programming to investigate three months’ worth of access logs of
the journal The South Pacific Journal of Psychology. The main focus of their study
was general use and referrals to find out about the geographical distribution of the
users.

Morse and Clintworth (2000) filtered Ovid transaction logs to study six months’
usage of 194 journals, which were available both in print and online in the Norris 
Medical Library at the University of Southern California. Because of the goal of the 
study, which was to compare the pattern of usage of print and online journals, the 
analysis was limited to general download of full-text articles. 

Obst (2003) and Kortelainen (2004) both used processed usage data for their 
studies. Obst (2003) studied online usage statistics of 270 journals delivered by five 
publishers. The researcher corrected the data for redundant multiple accesses. The 
aim of the study was to examine the correlation of usage of a matched set of print 
and online titles, the validity of e-journals usage statistics and the impact of online 
journals on print journal usage. Obst noticed that the problem in the usage data of 
most publishers was that the usage and type of access were not defined. Kortelainen 
(2004) utilised the usage data of the electronic journals supplied by The Finnish 
National Electronic Library (FinELib) portal to investigate the relative advantages, 
compatibility, complexity and visibility of e-journals and their effects on usage. The 
results showed that there was a clear difference between the use of the e-journals 
(such as Emerald journals) and article files (those services that provide journal 
articles in full-text form, without really utilizing the advantages of digital publishing, 
such as EBSCO). 

Nicholas and his colleagues in CIBER conducted a series of studies on Emerald and 
Blackwell electronic journals in order to evaluate the impact of the Big Deal on users’
behaviour and, more generally, to find out about the information seeking
behaviour of digital journal users (Nicholas et al., 2003, 2005a, b). Based on the experience gained from
investigating consumer health logs, they developed a more sophisticated methodology 
for log analysis called DLA. They paid more attention to users in their analyses and 
highlighted the importance of returnees and bouncers. The strength of their 
methodology is due to the following features: 


* Using SPSS to analyse raw logs instead of proprietary log analysis software. 
Unlike proprietary software that imposes predefined definitions and breakdowns 
on the researchers, SPSS provides more flexibility and enables the researchers to 
define their own variables and breakdowns. 


* Enriching log data with demographic data, such as user data gathered from the 
subscription databases of publishers. 


* Classifying users through a combination of their demographic attributes
and their usage.


* Paying special attention to returnees, that is, users who come back to use the service.


In an article called “Micro-mining and segmented log file analysis”, Nicholas and 
Huntington (2003) suggested three deep log micro techniques for analysis: 


(1) the construction of a subgroup of users for which researchers can feel confident 
in regard to their geographical origin; 

(2) the analysis of a subgroup of users whose IP addresses were more likely to 
reflect the use of the same individuals; and 


(3) the tracking and reporting of the use made by individuals rather than groups. 
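
As a rough illustration of how returnees might be separated from one-off visitors in a raw log, the Python sketch below treats an IP address as a returnee when it appears on more than one distinct day. This operational definition, the function name `find_returnees` and the sample records are assumptions made for the example; they are not the definitions used by Nicholas and Huntington, and the approach inherits the proxy and caching problems already discussed.

```python
from collections import defaultdict
from datetime import date

def find_returnees(records):
    """Return the set of IPs seen on more than one distinct day.

    records: iterable of (ip, day) pairs, where day is a datetime.date.
    """
    days_seen = defaultdict(set)
    for ip, day in records:
        days_seen[ip].add(day)
    return {ip for ip, days in days_seen.items() if len(days) > 1}

# Invented sample log: the first address comes back a week later.
log = [
    ("192.0.2.10", date(2004, 3, 1)),
    ("192.0.2.10", date(2004, 3, 1)),
    ("192.0.2.10", date(2004, 3, 8)),
    ("192.0.2.20", date(2004, 3, 2)),
]
print(find_returnees(log))  # {'192.0.2.10'}
```

Restricting the input to a subgroup of addresses believed to map to single individuals, as in technique (2), would make the resulting counts easier to interpret.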


Findings of log based studies 

As we mentioned before, log analysis has been applied for different purposes such as 
assessing system performance, studying users’ searching and browsing behaviours, 
investigating the effectiveness of Big Deal subscriptions, studying literature decay, 
and so on. Different digital journal platforms or libraries also have different features. 
These factors make it difficult to compare the results of the different studies and
to make generalisations.

In terms of volume of use, previous log studies have led to different conclusions 
about the success or otherwise of Big Deal and consortium subscriptions to journals. 
While Gargiulo (2003) analysed logs of an Italian consortium and strongly 
recommended Big Deal subscriptions, Obst (2003) saw “no future” for package deals 
on the basis of the results of his comparative study. Davis (2002) challenged the 
composition of geographic based consortia too. He recommended libraries create 
consortia based on homogeneous membership. He suggested that libraries with similar 
users and needs should create a consortium. Nicholas et al. (2003) showed that users
who had Big Deal subscriptions behaved differently from those who did not. Their 
findings indicated that Big Deal users viewed more journals and conducted longer 
sessions. Nevertheless, there seems to be a considerable degree of concentration in the 
use of journals. Morse and Clintworth (2000) found that just 20 per cent of titles 
accounted for nearly 60 per cent of usage. Davis and Solla (2003) revealed that a small 
number of heavy users can have an extremely large effect on the number of total 
downloads. Another study showed that 4.9 per cent of a journal collection satisfied 44 
per cent of downloads and, on the other hand, 59 per cent of the collection represented 
only 10 per cent of the use of the collection (Davis, 2002). This distribution in usage 
data needs to be studied more. The effect of log analysis limitations, particularly the 
problems with caching and proxy servers, on this asymmetric pattern of use is yet to 
be investigated. 
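
Concentration figures such as "20 per cent of titles account for nearly 60 per cent of usage" can be reproduced from per-title download counts. The Python sketch below is a minimal illustration under stated assumptions: the function name `share_of_top_titles` and the download counts are invented for the example and do not come from any of the cited studies.

```python
import math

def share_of_top_titles(downloads_per_title, top_fraction):
    """Return the share of total downloads due to the most-used titles.

    downloads_per_title: list of download counts, one per journal title.
    top_fraction: fraction of titles to include, e.g. 0.2 for the top 20 per cent.
    """
    ranked = sorted(downloads_per_title, reverse=True)
    total = sum(ranked)
    if total == 0:
        return 0.0
    # Round up so that at least one title is counted for any positive fraction.
    k = math.ceil(len(ranked) * top_fraction)
    return sum(ranked[:k]) / total

# Invented counts for five titles: one title dominates.
downloads = [60, 10, 10, 10, 10]
print(share_of_top_titles(downloads, 0.2))  # 0.6
```

Running the same calculation with and without known proxy addresses would be one simple way to probe how far caching and proxy servers shape the skew.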


Log studies indicate a relative preference for PDF versions of articles over HTML
versions among users (Ke et al., 2002; Davis and Solla, 2003; Nicholas et al., 2003).
Questionnaire studies confirm this preference and highlight the fact that most users do 
not like reading on the screen (Tenopir, 2003; Tenner and Zheng Ye, 1999; Worlock, 
2002). This indicates that users of e-journals probably choose a PDF version because it 
is more printer friendly and better for archiving.

Log studies have been particularly helpful in understanding the searching and
browsing behaviour of e-journal users. The findings of the eJUSt project showed that
there were two major starting points for journal web visits: through journal home
pages and through PubMed. This differed for individual journals. Entering journal
web sites through homepages usually led to either browsing contents or searching for 
an article. More users read full text right away instead of reading abstracts first to see 
if articles were of interest; however, certain journals’ users requested abstracts before 
reading full text. Users tend to read full text after browsing contents. Either abstracts 
or full text views in HTML preceded requests for full text in PDF format. Three 
common seeking patterns were found: 


(1) journal homepage → TOC → HTML full text → PDF full text;
(2) PubMed → HTML full text → PDF full text; and
(3) journal homepage → search → HTML full text → PDF full text.
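
Seeking patterns of this kind can be tallied from session data once each request has been labelled with a page type. The Python sketch below is only an illustration: the page-type labels, the helper names `collapse_repeats` and `count_paths`, and the sample sessions are assumptions made here, not the procedure used in the eJUSt study.

```python
from collections import Counter

def collapse_repeats(pages):
    """Drop consecutive duplicates so that reloading a page does not alter the path."""
    path = []
    for page in pages:
        if not path or path[-1] != page:
            path.append(page)
    return tuple(path)

def count_paths(sessions):
    """Count how often each navigation path occurs across sessions."""
    return Counter(collapse_repeats(session) for session in sessions)

# Invented sessions, each a list of page types in request order.
sessions = [
    ["homepage", "toc", "html", "pdf"],
    ["pubmed", "html", "pdf"],
    ["homepage", "toc", "toc", "html", "pdf"],
    ["homepage", "search", "html", "pdf"],
]
paths = count_paths(sessions)
print(paths.most_common(1))  # [(('homepage', 'toc', 'html', 'pdf'), 2)]
```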


The findings showed that most requests were for full text in HTML, which were then 
followed by requesting the full text in PDF, as if the final goal of most visits was to take 
away a PDF version of an article (Institute for the Future, 2002). Log analysis of 
ScienceDirect OnSite (SDOS) in Taiwan shed some light on the searching behaviour of 
users. The analysis revealed that roughly 32 per cent of all recorded page accesses 
related to full-text accesses, 34 per cent of accesses related to browsing, 13 per cent 
related to searching, and 9 per cent of accesses related to abstract page views. Users 
that started navigation using the SDOS browsing feature had two options available: 
76 per cent of users chose “Alphabetical List of Journals” and 24 per cent chose 
“Category List of Journals”. In terms of search queries, of all users, 42 per cent made 1 
to 20 queries. They found that approximately 85 per cent of queries contained one, two, 
or three terms, although the average query length was 2.27 terms. A total of 91 per cent 
of the queries were of the simple search type, while only 9 per cent of the queries were 
of the expanded search type. “Any Field” was the default query field, matching any of 
the fields that can be searched, and was used in 84 per cent of simple searches. On the 
other hand, about half (49 per cent) of expanded search usage included fields other than 
the default field. Article title, author's name, and abstract were the three query fields 
most frequently used in expanded search mode (Ke et al., 2002). The SuperJournal
project showed that researchers were not very good at searching (Eason et al., 2000).
But things have changed in the ten years that have elapsed since the SuperJournal
project. Recent questionnaire surveys illustrate a tendency among online journal
users to search rather than browse (Sathe et al., 2002; Boyce et al., 2004). Log studies
provide supporting evidence for this. The analysis of referral logs of chemical journals
showed that library catalogues and bibliographic databases, which are both searching
mechanisms, were the top two sources that led users to journals (Davis, 2004b).
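
Query statistics such as the average query length and the proportion of short queries, as reported for the Taiwan logs, can be derived directly from a list of logged query strings. The following Python sketch assumes that terms are separated by whitespace, which may not match how Ke et al. counted terms; the function name `query_length_stats` and the sample queries are invented for illustration.

```python
def query_length_stats(queries):
    """Return (mean number of terms, share of queries with at most three terms).

    Terms are split on whitespace; empty queries are ignored.
    """
    lengths = [len(q.split()) for q in queries if q.strip()]
    if not lengths:
        return 0.0, 0.0
    mean_length = sum(lengths) / len(lengths)
    short_share = sum(1 for n in lengths if n <= 3) / len(lengths)
    return mean_length, short_share

# Invented queries with 2, 3, 1 and 6 terms respectively.
queries = [
    "digital libraries",
    "log analysis methods",
    "e-journals",
    "user behaviour in electronic journal use",
]
print(query_length_stats(queries))  # (3.0, 0.75)
```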


Conclusion 

The standardisation of usage data and provision of COUNTER (www.projectcounter.org)
compliant statistics by publishers have helped libraries gain a clearer picture of
the use of their journal collections. However, processed usage data cannot disclose 
enough information about usage patterns and cannot compete with raw log data. 
Previous studies of online journals employing log analysis demonstrate the great 
potency of this method and the greater variety of its application for studying different
aspects of use and users of e-journals. Although there has been an ongoing debate on
the pitfalls of web log analysis, some of the studies indicate that there is every
opportunity for improving the methodology. DLA methods that have been developed
by Nicholas and his colleagues in CIBER have opened a new horizon in studying
e-journal use and users. Based on the experiences of CIBER in employing log analysis
methods (Nicholas et al., 2005a), several steps can be taken to enrich the log data and so
obtain more robust data. The first step is to make decisions about how the data are 
defined and recorded (for instance, who is a user, what is a hit, what represents success, 
etc.), re-align as necessary, and assess statistical significance. After this step, which is 
all about deciding on operational definitions, the raw data should be re-engineered to 
provide more powerful metrics and to ensure that data gathering is better aligned to
organisational goals. In order to make more sense of log data, the second step should be 
to enrich the usage data by adding user demographic data (e.g. occupation, subject 
specialism), either with data obtained from a subscriber database (ideal) or online 
questionnaires (not so ideal, as user data cannot be mapped so closely on usage data). 
Categorising the users into smaller groups rather than looking at a broad picture of the 
usage and tracing the usage by some individual users as case studies help achieve a 
deeper knowledge of usage patterns and users’ behaviour. Finally, to strengthen the 
results of log analysis and test the findings, some questionnaire, interview or 
observation studies should be conducted to explain the information seeking behaviour 
of the users discovered in the logs. Since log analysis provides little in the way of 
explanation, satisfaction and impacts, and rather raises the questions that really need 
to be asked, a survey or qualitative study needs to be done to find the answers to the 
questions. 
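
The enrichment step described above, in which usage records are joined to user demographic data, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the record layout, the function name `usage_by_group` and the sample subscriber data are hypothetical, and a real subscriber database would need a reliable user identifier in the logs.

```python
from collections import Counter

def usage_by_group(log_rows, subscribers, attribute):
    """Total downloads per demographic group.

    log_rows: iterable of (user_id, downloads) pairs taken from the usage log.
    subscribers: dict mapping user_id to a dict of demographic attributes.
    attribute: the attribute to group by, e.g. "occupation".
    Users missing from the subscriber data are grouped under "unknown".
    """
    totals = Counter()
    for user_id, downloads in log_rows:
        profile = subscribers.get(user_id)
        group = profile.get(attribute, "unknown") if profile else "unknown"
        totals[group] += downloads
    return dict(totals)

# Invented subscriber records and usage rows.
subscribers = {
    "u1": {"occupation": "faculty", "subject": "chemistry"},
    "u2": {"occupation": "student", "subject": "physics"},
}
log_rows = [("u1", 5), ("u2", 2), ("u1", 3), ("u9", 1)]
print(usage_by_group(log_rows, subscribers, "occupation"))
# {'faculty': 8, 'student': 2, 'unknown': 1}
```

The size of the "unknown" group gives a quick indication of how much of the logged use cannot be mapped to a user profile.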

Log analysis is clearly useful for certain kinds of analyses, like shedding light on the 
format of the articles scientists read (PDF or HTML), the age of the articles 
(obsolescence), and the way scientists navigate to the required material (searching and 
browsing behaviour). But log analysis is not all that helpful at discovering the value 
and use of the articles retrieved, or about what lies behind expressed information 
seeking behaviour. Although CIBER researchers have succeeded in enriching log data 
with demographic data in order to find out about the information seeking behaviour of 
different users, so far log analysis has not been a very efficient technique for finding 
out about the differences of information seeking behaviour among users from different 
subjects, or about the effects of the status of users on their information seeking 
behaviour. These are areas in which log analysis methods must be improved. The 
results of log analyses should be enhanced by triangulation of the findings of studies 
with other methodologies. 

A number of areas suggest themselves as fruitful for investigation, employing a
combination of log analysis, questionnaire surveys and observation studies: 


* Has the electronic information environment led to a 24/7 culture? Though
past log studies showed low usage of e-journals in out-of-work hours (Nicholas
et al., 2003), the reason(s) for this low usage, and the differences in the information
seeking behaviour of users at different times and in different places (office or home),
need more investigation.

* What are the disciplinary differences in terms of journal use? How different are
the usage patterns of users with different status? Most of what we know about
these questions comes from self-reporting surveys. Combining demographic data
with logs can provide a better picture of the differences in information seeking
behaviour among users in different subjects and with different status (e.g.
student, professor and researcher).


* How do users navigate to the articles they want? What are their preferred 
methods of getting to articles, their bookmarks, searching databases, journal 
web sites or libraries’ e-journal list? How does the task affect the way the users 
seek articles? 


* What is going on in regard to pay by view? Do those who buy articles behave 
any differently from those who have free access to journals? 


Notes 
1. In this paper we use the terms “online journal”, “digital journal”, and “electronic journal” 
interchangeably, and by them we mean all scholarly journals which are available in 
electronic format on the web, no matter if they have print equivalent or not. 


2. Centre for Information Behaviour and Evaluation of Research (www.ucl.ac.uk/ciber).


Cristina Hastings
London Business School, London, UK 


is the recipient of the Journal’s Outstanding Paper Award for Excellence for her paper 


“Discussion of performance measures in public service broadcasting” 
which appeared in Aslib Proceedings, Vol. 56 No. 5, 2004 


Cristina Hastings’ research in performance measures began during her five year period at Accenture, where, 
alongside client assignments, she contributed to innovative thinking in measuring business performance and 
the challenges of tying these to stakeholder wants and needs. At Proteus, a media consultancy, she 
subsequently developed their internal performance measurement system, building this alongside the career 
progression structure. Cristina wrote this article whilst studying for her full-time MBA at London Business 
School, and following assignments within BBC Strategy and BBC New Media. Cristina now works in Strategic 
Planning for AstraZeneca UK. She was educated at Oxford University (MA) and London Business School (MBA). 
She is a keen tennis player, playing for The Queens’ Club, and competes in triathlons. 





Note from the publisher 


Outstanding doctoral research awards 

As part of Emerald Group Publishing’s commitment to supporting excellence in
research, we are pleased to announce that the first annual outstanding doctoral 
research awards have been decided. Details about the winners are shown below. 
The year 2005 was the first year in which the awards were presented and, due to the 
success of the initiative, the programme is to be continued in future years. The idea for 
the awards, which are jointly sponsored by Emerald Group Publishing and the 
European Foundation for Management Development (EFMD), came about through 
exploring how we can encourage, celebrate and reward excellence in international 
management research. Each winner has received €1,500 and a number have had the 
opportunity to meet and discuss their research with a relevant journal editor. Increased 
knowledge-sharing opportunities and the exchange and development of ideas that 
extend beyond the peer review of the journals have resulted from this process. The 
awards have specifically encouraged research and publication by new academics: 
evidence of how their research has impacted upon future study or practice was taken 
into account when making the award selections and we feel confident that the winners 
will go on to have further success in their research work. 

The winners for 2005 are as follows: 


* Category: Business-to-business marketing management.
Winner: Victoria Little, University of Auckland, New Zealand.
Understanding customer value: an action research-based study of contemporary
marketing practice.

* Category: Enterprise applications of internet technology.
Winner: Mamata Jenamani, Indian Institute of Technology.
Design benchmarking, user behaviour analysis and link-structure personalization in
commercial web sites.

* Category: Human resource management.
Winner: Leanne Cutcher, University of Sydney, Australia.
Banking on the customer: customer relations, employment relations and worker
identity in the Australian retail banking industry.

* Category: Information science.
Winner: Theresa Anderson, University of Technology, Sydney, Australia.
Understandings of relevance and topic as they evolve in the scholarly research process.

* Category: Interdisciplinary accounting research.
Winner: Christian Nielsen, Copenhagen Business School, Denmark.
Essays on business reporting: production and consumption of strategic information
in the market for information.

* Category: International service management.
Winner: Tracey Dagger, University of Western Australia.
Perceived service quality: proximal antecedents and outcomes in the context of a high
involvement, high contact, ongoing service.

* Category: Leadership and organizational development.
Winner: Richard Adams, Cranfield University, UK.
Perceptions of innovations: exploring and developing innovation classification.

* Category: Management and governance.
Winner: Anna Dempster, Judge Institute of Management, University of Cambridge, UK.
Strategic use of announcement options.

* Category: Operations and supply chain management.
Winner: Bin Jiang, DePaul University, USA.
Empirical evidence of outsourcing effects on firm’s performance and value in the
short term.

* Category: Organizational change and development.
Winner: Sally Riad, Victoria University of Wellington, New Zealand.
Managing merger integration: a social constructionist perspective.

* Category: Public sector management.
Winner: John Mullins, National University of Ireland, Cork.
Perceptions of leadership in the public library: a transnational study.


Submissions for the second Annual Emerald/EFMD Outstanding Doctoral Research
Awards are now being received and we would encourage you to recommend the
awards to doctoral candidates who you believe to have undertaken excellent research.
The deadline by which we require all applications is 1 March 2006. For further
details about the subject categories, eligibility and submission requirements, please
visit the web site: www.emeraldinsight.com/info/researchers/funding/doctoralawards/2006awards.html.


Aslib Proceedings: New Information Perspectives


Copyright

Articles submitted to the journal should be original contributions and
should not be under consideration for any other publication at the
same time. Authors submitting articles for publication warrant that the
work is not an infringement of any existing copyright and will
indemnify the publisher against any breach of such warranty. For ease
of dissemination and to ensure proper policing of use, papers and
contributions become the legal copyright of the publisher unless
otherwise agreed. Articles should be submitted by e-mail or on disc to:


The Editor 

Professor David Nicholas, Professor of Library and Information Studies 
and Director of the School of Library, Archive and Information Studies, 
University College London, 1 Gower Street, London WC1E 6BT, UK. 
E-mail: david.nicholas@ucl.ac.uk 


Editorial objectives 

Aslib Proceedings: New Information Perspectives brings currency, 
authority and accessibility to the reporting of current research, issues 
and debates in the broad area of information work. Above all, the 
journal wishes to provide research and comment in a form that is 
easily and quickly understood: a fresh, rigorous, but unfussy, writing 
style is what is aimed for. The needs of busy practitioners (and
academics) are very much in mind. Articles are refereed to meet the
need for accredited and authoritative information. To ensure that this
does not result in an unacceptable loss of currency, articles are
speedily refereed by the journal’s large, eminent and multi-disciplinary
Editorial Board (assembled for just such a purpose). In
some cases they may be referred to an external panel of consultants. 
The submission of research in progress articles is encouraged. 

The boundaries of information work are rapidly changing and 
widening. Aslib Proceedings has a role to play in the repositioning of 
the information profession by encouraging submissions from
information managers, providers, packagers and consumers of all 
kinds — and especially those from the related fields of journalism, 
electronic publishing, communication and Internet studies. 


Manuscript requirements 

As a guide, articles should be between 2,000 and 8,000 words in 
length. A title of not more than eight words should be provided. A
brief autobiographical note should be supplied including full name, 
affiliation, e-mail address and full international contact details. 
Authors must supply a structured abstract set out under 4-6 
sub-headings: Purpose; Methodology/Approach; Findings; Research 
limitations/implications (if applicable); Practical implications (if 
applicable); and the Originality/value of paper. Maximum is 250 
words in total. In addition provide up to six keywords which 
encapsulate the principal topics of the paper and categorise 
your paper under one of these classifications: Research paper,
Viewpoint, Technical paper, Conceptual paper, Case study, Literature
review or General review. For more information and guidance on
structured abstracts visit: www.emeraldinsight.com/info/authors/writing_for_emerald/submissions/structured_abstracts.html

Where there is a methodology, it should be clearly described 
under a separate heading. Headings must be short, clearly defined 
and not numbered. Notes or Endnotes should be used only if 
absolutely necessary and must be identified in the text by 
consecutive numbers, enclosed in square brackets and listed at the 
end of the article. 

All Figures (charts, diagrams and line drawings) and Plates
(photographic images) should be submitted in both electronic form
and hard copy originals. Figures should be of clear quality, black and
white and numbered consecutively with arabic numerals.

Electronic figures should be either copied and pasted or saved
and imported from the origination software into a blank Microsoft
Word document. Figures created in MS Powerpoint are also
acceptable. Acceptable standard image formats are: .eps, .pdf, .ai and
.wmf. If you are unable to supply graphics in these formats then
please ensure they are .tif, .jpeg, .bmp, .pcx, .pic, .gif or .pct at a
resolution of at least 300dpi and at least 10cm wide. To prepare
screenshots simultaneously press the “Alt” and “Print screen” keys 
on the keyboard, open a blank Microsoft Word document and 
simultaneously press “Ctrl” and “V” to paste the image. (Capture all 
the contents/windows on the computer screen to paste into MS Word,
by simultaneously pressing “Ctrl” and “Print screen”.)

For photographic images (plates) good quality original 
photographs should be submitted. If supplied electronically they
should be saved as .tif or .jpeg files at a resolution of at least 300dpi
and at least 10cm wide. Digital camera settings should be set at the
highest resolution/quality possible.

In the text of the paper the preferred position of all figures and 
plates should be indicated by typing on a separate line the words 
“Take in Figure (No.)” or “Take in Plate (No.)”. Supply succinct and
clear captions for all figures and plates. 

Tables must be numbered consecutively with roman numerals and 
a brief title. In the text, the position of the table should be shown by 
typing on a separate line the words “Take in Table IV”. 

References to other publications must be in Harvard style and 
carefully checked for completeness, accuracy and consistency. This is 
very important in an electronic environment because it enables your 
readers to exploit the Reference Linking facility on the database and 
link back to the works you have cited through CrossRef. You should
include all author names and initials and give any journal title in full.

You should cite publications in the text: (Adams, 1997) using the 
first named author's name. At the end of the paper a reference list in 
alphabetical order should be supplied: 

For books: surname, initials, (year), title of book, publisher, place
of publication, e.g. Fallbright, A. and Khan, G. (2001), Competing 
Strategies, Outhouse Press, Rochester. 

For book chapters: surname, initials, (year), “chapter title”, 
editor's surname, initials, title of book, publisher, place of
publication, pages, e.g. Bessley, M. and Wilson, P. (1999), “Marketing
for the production manager”, in Levicki, J. (Ed.), Taking the Blinkers off
Managers, Broom Reim, London, pp. 29-33.

For journals: surname, initials, (year), “title of article”, journal 
name, volume, number, pages, e.g. Greenwald, E. (2000), 
“Empowered to serve”, Management Decision, Vol. 33 No. 5, pp. 6-10. 

For electronic sources: if available online the full URL should be 
supplied at the end of the reference. 


Final submission of the article 

Once accepted for publication, the final version of the manuscript 
must be provided, accompanied by a 3.5" disk, Zip disk or CD-ROM of 
the same version labelled with: disk format (Macintosh or PC); author
name(s); title of article; journal title; file name. 

Alternatively, the editor may request the final version as an 
attached file to an e-mail. 

Each article must be accompanied by a completed and signed 
Journal Article Record Form available from the Editor or on
www.emeraldinsight.com/info/authors/writing_for_emerald/jarform.html

The manuscript will be considered to be the definitive version of 
the article. The author must ensure that it is complete, grammatically 
correct and without spelling or typographical errors. 

In preparing the disk, please use one of the following preferred
formats: Word, Word Perfect, Rich text format or TeX/LaTeX. 

Technical assistance is available by contacting Mike Massey at 
Emerald. E-mail mmassey@emeraldinsight.com 